CUMULATIVE TEST RECORDS: THEIR NATURE
AND USES 


ARTHUR E. TRAXLER 
Educational Records Bureau 


MOST SCHOOLS now recognize certain values in objective
tests of academic aptitude and achievement and
employ such tests to some extent in the appraisal, placement, 
instruction, and guidance of their pupils. It is generally 
understood that objective tests do not measure all educational 
outcomes, but studies have repeatedly demonstrated that they 
do measure certain aspects of ability and achievement that are 
important in the scholastic success of individual boys and 
girls. 

It may seem merely a reiteration of an obvious point to 
say that the value of a testing program is directly proportional 
to the nature and extent of the uses of the results by the 
faculty and students. Nevertheless, emphasis on this point is 
necessary, for not infrequently school authorities administer a 
series of tests, file the scores, and then give no further atten- 
tion to the test data. When no improvement is noted, they 
blame the tests when the real fault lies with their failure to 
study the results. Tests are not in themselves remedial instru- 
ments; they are tools which can be indispensable aids to diag- 
nosis and thus form an important basis for the planning of 
instruction and guidance, provided someone carefully and in- 
telligently studies the data which they provide. The analysis 
of the test results may to some extent be concerned with 
groups, but it should deal primarily with individuals. 

Before test results can be studied and used to best advan- 
tage they must be recorded in some convenient form. Alpha- 
betical class lists of the scores and percentile ranks of indi- 
vidual pupils, accompanied by sheets showing the distributions 
of scores for various classes, are very useful to teachers for
purposes of a quick survey of the results for individuals or 
groups with reference to restricted areas of ability or achieve- 
ment. They do not, however, readily provide a compre- 
hensive picture of the results of all tests taken by any one 
individual. 

Individual record sheets on which the results from a single 
testing program are summarized in tabular or graphic form 
provide a very helpful picture of the status of a pupil at any 
one time. One can, for example, administer a general achieve- 
ment test battery such as the Metropolitan or the Stanford, 
plot the achievement profile of each pupil, and thus obtain 
a graphic representation of strengths and weaknesses that 
greatly simplifies the problem of diagnosis. A graph of this 
type is illustrated in Figure 1. One can see at a glance that in 
comparison with the grade norms, this pupil is strong in read- 
ing, literature, history and civics, and geography, but relatively 
weak in arithmetic and spelling. 

However, distributions, class lists, and diagnostic profiles 
resulting from one testing program share a common limita- 
tion. They show status, but they do not show growth. For 
both instruction and guidance, the concept of growth—how 
far a pupil has come within a certain period and how far 
he should be able to go—is probably fully as important as the 
concept of present status. 

Now, there is a type of record that provides evidence about 
both status at any testing period and growth between testing 
periods. This is the individual cumulative record. It is un- 
questionably the most valuable aid to the intelligent and effi- 
cient use of test results yet devised. It is to other kinds of 
records what a motion picture is to a snap shot. 






















[Figure 1. Achievement profile chart for an individual pupil, plotted against grade norms.]


The cumulative record presupposes a regular, systematic 
testing program. If tests are administered in a school at irreg- 
ular intervals and without definite plan, the value of a 
record of this kind will be greatly curtailed, but even under 
these conditions, it will probably prove more useful than any 
other kind of record of test results. 
Explanation and Illustration of Cumulative Test Records

Any record for individual pupils that provides for succes- 
sive additions of the same type of data over a period of 
years and thus makes possible a study of the changes that have 
taken place may be called a cumulative record. Thus, a tabular
arrangement of test scores and percentiles by years is cumulative
in nature. Test results entered in this way, however,
cannot be apprehended quickly, but require detailed
study. This fact has led many persons to favor the graphic 
representation of test results. Among the various types of 
graphs thus far devised, the gridiron percentile graph first 
employed in the American Council cumulative record forms 
stands out as the most widely used type. Its success is no 
doubt due partly to the fact that it will accommodate any 
kind of test result that can be expressed in terms of percentiles. 

One of the best known adaptations of the American Coun- 
cil form is the Educational Records Bureau cumulative record 
for independent schools. This form, like the American Coun- 
cil form, is planned for six years, but it may be expanded to 
include any number of years. The test portion of this type 
of record is illustrated in Figures 2, 3, and 4. Let us note in 
some detail the nature of the information provided. 

The card is divided by heavy vertical lines into broad 
columns, each of which represents a grade or a year in the 
life of the pupil. The year and grade are indicated at the 
top of each column. 

The front of the card is devoted almost entirely to a rec- 
ord of class work and to an extensive test record. Since 
the main purpose of this article is to discuss test records, the 
portion of the sample forms dealing with subjects, marks, and 
credits has not been filled out. The test results are reported 
in both tabular and graphic form. The scores and correspond- 
ing percentiles are entered in the table and the percentiles are 
then used as the basis of the graph, which occupies approxi- 
mately the lower half of the card. 

The graph of test scores is the clearest phase of the record 
to one familiar with graphs of this kind, but it often seems 
somewhat puzzling to persons who have had no experience
with it. The percentiles along the scale at the left are placed 
according to standard deviations in a normal distribution, and 
thus the distance between successive percentiles is much smaller 
near the median than near the extremes. The median, or 
50th percentile, is marked by the heavy line going horizontally 
across the graph. The symbols at the top—Jy, Au, S, O, and 
so forth—stand for the months of the year. The months are 
grouped according to the school year rather than the calendar 
year. 
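
The spacing of this percentile scale can be illustrated with a short sketch. It assumes, as the description above indicates, that each percentile is placed at its normal-deviate (standard-deviation) position. The function name `percentile_to_sigma` is a hypothetical label chosen for this illustration and is not part of the record form.

```python
from statistics import NormalDist


def percentile_to_sigma(percentile):
    """Return the distance from the median, in standard-deviation units,
    at which a percentile would be placed on a normal-deviate scale.

    Assumption: the scale follows the standard normal distribution.
    """
    if not 0 < percentile < 100:
        raise ValueError("percentile must lie strictly between 0 and 100")
    return NormalDist().inv_cdf(percentile / 100)


if __name__ == "__main__":
    # Print the placement of a few percentiles relative to the median line.
    for p in (1, 7, 25, 50, 75, 93, 99):
        print(f"{p:>2}th percentile: {percentile_to_sigma(p):+.2f} SD")
```

On such a scale the step from the 50th to the 60th percentile covers about a quarter of a standard deviation, while the step from the 89th to the 99th covers more than one.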

The same percentile data that are shown in the table of 
scores are entered in the graph, except that to prevent over- 
crowding, the percentiles for the parts of the English test 
have been omitted from the graph. The percentiles used in 
these records are based on results in independent schools, but 
the interpretation of public-school percentile ratings would 
be made in exactly the same way. 

The small dots on the graph show the placement of the 
various percentiles, the dots being identified by the abbreviated 
names of the tests printed near them. For example, in Figure 
2, the dot toward the top of the graph is labeled “French” 
to indicate that it stands for the percentile on the Cooperative 
French test. The percentile for the pupil’s total score of 61 
on the French test is 93, and this is indicated by placing the 
dot opposite 93, one of the points shown on the percentile 
scale at the left of the chart. In other words, the pupil’s 
French score was above the scores of 93 per cent of the inde- 
pendent-school ninth-grade first-year French students who took 
the test in the spring of 1941. 

The percentile points for tests that are in the same field 
from year to year are connected by lines, so that one can 
readily follow a particular type of achievement throughout the 
whole period covered by the test. For instance, one of the 
lines in Figure 3 runs from the arithmetic percentile in Grade 
6 to the arithmetic percentile in Grade 7, and from that point 
to the arithmetic percentile in Grade 8, thence to the elementary
algebra percentile in Grade 9, et cetera. Achievement
percentiles are connected with solid lines, academic aptitude
percentiles with broken lines, and chronological age
percentiles with dotted lines.

The record shown in Figure 2, that of Edwin Martin,¹
covers only one year. In many independent schools, a consid- 
erable proportion of the records will be of this type, for the 
number of one-year students attending private schools tends 
to be fairly large. Even in the case of single-year records, 
it is desirable to record the data on a cumulative record form, 
for such a procedure facilitates comparisons between academic 
aptitude and achievement and makes it possible to summarize
readily the student’s test record for the year as a whole.

Figure 2 shows that in the fall of 1940, Edwin was close 
to, but slightly below average for his grade in chronological 
age and that he was somewhat below the median in academic 
aptitude, as indicated by the results of the American Council 
Psychological Examination, and in reading, as measured by 
the Nelson-Denny Test. These results are recorded directly 
below the letter O, which shows that the data were obtained 
in October. 

The spring, 1941, percentiles are entered beneath the letter
A, and thus one knows that the tests were given in April.
Edwin seems to be an able student of foreign language. As 
already indicated, his French score was above those of all but 
7 per cent of the first-year French students in Grade 9. His 
total score on the Latin test fell within the highest third of 
the independent-school ninth-grade first-year group. 

In science and elementary algebra, the boy was above the 
independent-school median but not outstanding. His total 
English percentile and his literary acquaintance percentile were
below the median but above his academic aptitude percentile.
In general, Edwin’s achievement test percentiles were somewhat
higher than his percentiles in academic aptitude and
reading. This is, of course, an encouraging finding, for it
indicates that, presumably because of application and hard
work, the boy’s achievement record near the end of the school
year was better than one would expect it to be on the basis
of the fall test results.

1These are actual test records, but the names of the pupils and the schools
are fictitious.

Let us now examine a cumulative record covering several 
years. The test record of Betty A. Stetson, as shown in Fig- 
ure 3, is that of a girl who was a little younger than the 
average pupil in her grade but who was generally high in both
academic aptitude and achievement. Her Otis intelligence
test percentiles in Grade 6 were exceptionally high. Her
later academic aptitude percentiles were a little lower, but all 
of them were significantly above the median for her grade. In 
fact, her Otis scores in Grades 6 and 7 and her scores on the 
American Council Psychological Examination in Grades 9 and 
10 were in the highest tenth of the scores made by the inde- 
pendent-school pupils at the same grade levels. 

Betty’s achievement test percentiles were, in general, some- 
what lower than her percentiles in academic aptitude, but most 
of them were in the upper half of the scores of the pupils 
in her grade. The only achievement percentiles below the 
median were those for spelling in Grade 6, geography in 
Grade 7, arithmetic in Grade 8, general science in Grade 10, 
and modern European history in Grade 11. The history score 
was the only very low result in the entire record. 

This girl is obviously an excellent reader. On the reading 
tests, she maintained a position within the highest tenth of her 
grade throughout the entire period. 

In a graphic record of this kind, growth in any subject 
precisely equal to the growth of the group as a whole in that 
subject is shown by an exactly horizontal line. That is, if a 
pupil improves just as much as the group improves in a year, 
he will maintain the same percentile rating from one year to 
the next. Lines which go upward, then, indicate greater than 
average growth and lines which slope downward suggest less 
than average growth. In interpreting such variations, how- 
ever, one should keep in mind the fact that every test involves 
a certain amount of sampling error and that the population on 
which the percentiles are based is not exactly the same from 
year to year. So some variation in percentiles for the same
subject is to be expected even when growth is normal. One 
should also remember that tests of different subjects in the 
same field—for example, algebra and plane geometry or biol- 
ogy and chemistry—involve somewhat different abilities. In 
these instances, changes in percentile rating should not be 
interpreted in terms of growth. 
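
The relation between growth and percentile position can be sketched numerically. The scores below are hypothetical, invented only for illustration, and `percentile_rank` uses the simple definition employed earlier in this article: the per cent of the group whose scores fall below the pupil's score.

```python
def percentile_rank(score, group_scores):
    """Return the per cent of the group scoring below `score`."""
    if not group_scores:
        raise ValueError("group_scores must not be empty")
    below = sum(1 for s in group_scores if s < score)
    return 100 * below / len(group_scores)


# Hypothetical raw scores for a comparison group in two successive years.
# Every member of the group is assumed to gain exactly 10 points.
group_year_1 = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
group_year_2 = [s + 10 for s in group_year_1]

pupil_year_1 = 70  # hypothetical pupil

if __name__ == "__main__":
    print("year 1:", percentile_rank(pupil_year_1, group_year_1))
    # Growth equal to, greater than, and less than the group's growth.
    for gain in (10, 20, 5):
        rank = percentile_rank(pupil_year_1 + gain, group_year_2)
        print(f"gain of {gain:>2} points -> year 2 percentile {rank}")
```

A gain equal to the group's gain leaves the percentile unchanged (a horizontal line on the graph); a larger gain raises it and a smaller gain lowers it.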

However, marked gains or losses in percentiles on the same 
test, such as an English test or a foreign language test, are 
symptomatic, and they may be indicative of a need for coun- 
seling attention in order to find the reason for the variation. 
For example, the steady downward trend of Betty’s per- 
centiles in French would seem to require an explanation that 
cannot be obtained from the record itself. 

On the whole, the seven-year record shown in Figure 3 
indicates definitely superior ability and achievement. A coun- 
selor, school principal, or college admissions officer familiar 
with this type of record could decide almost at a glance that, 
as far as aptitude and attainment are concerned, this girl 
should do well even in a highly selective college. 

The record of Charles W. Loring, shown in Figure 4, also 
includes Grades 6 to 12, inclusive, but it covers a period of 
eight years rather than seven, since this boy repeated the 
eighth grade. The general level of the percentiles is in marked 
contrast to that of the percentiles on the preceding record. 
This is an over-age pupil who is rather low in both academic 
aptitude and achievement in comparison with the average for 
his grade. Because he is advanced in chronological age, the 
percentiles corresponding to mental age and raw scores on 
intelligence tests tend to be somewhat higher than the per- 
centiles for I.Q., but with one exception they are below the 
independent-school median. 

In Grade 6, all but one of Charles’ scores on the New 
Stanford Achievement Test were distributed below the median 
for independent-school pupils at that grade level. The next 
year, most of his achievement test percentiles went upward to 
some extent, but the percentiles for language usage and arithmetic
were the only ones above the independent-school median
for Grade 7. In the following years, nearly all his percentiles 
were below the median for his grade, although some of them 
were not far below it. 

There is some evidence that this boy had read rather
widely and that he was a fairly competent reader. Three
of his literature percentiles and three of his reading percentiles
were above the median for his grade.

Charles’ repetition of Grade 8 did not significantly raise 
his subsequent record on the achievement tests. The only line 
that went upward noticeably as a result of the repetition of
this grade was the one for chronological age. The whole
record is that of a pupil who probably should not attempt
to enter the usual liberal arts college after graduation from 
the secondary school. Rather, he needs guidance into prepara- 
tion for some type of vocation the demands of which are 
consistent with his mediocre scholastic attainments. 

It will be observed that there is a general consistency in 
the test results throughout both illustrative records. The girl 
whose record is shown in Figure 3 was high in academic apti- 
tude and achievement in the elementary school and she main- 
tained this superiority throughout the secondary school. The 
boy whose record is contained in Figure 4 was low in academic 
aptitude and achievement in the elementary school and this 
low record was continued in the secondary school. In both 
cases, the general level of the percentile ratings in Grade 12 
could have been predicted from the results of the achievement 
tests taken in Grade 6. 

The tendency of the cumulative record of test results for 
an individual pupil to be in agreement from year to year is 
one of the most noteworthy aspects of this type of record. 
This tendency is verified by hundreds of such records which 
have been prepared at the Educational Records Bureau and 
other institutions. While the percentiles on an occasional test 
may vary markedly in successive years, the whole picture of 
a pupil’s record tends to remain much the same. This is 
usually true regardless of transfer from one school to another
or variation in type of instruction. Comprehensive test rec- 
ords obtained even as low as the second or third grade often 
predict with remarkable fidelity the level of achievement a 
pupil will attain in his senior year of high school. The fact 
that test results distributed over a long period of time tend 
to be positively correlated causes cumulative test histories to 
have exceptional potential guidance values. 
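
The statement that test results from different years tend to be positively correlated can be made concrete with a small sketch. The percentiles below are hypothetical values for six imaginary pupils, not data from the Bureau's records, and the Pearson product-moment coefficient is used here only as one common way of expressing such agreement.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson product-moment correlation of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical achievement percentiles for the same six pupils
# in Grade 6 and again in Grade 12 (illustrative values only).
grade_6 = [92, 75, 60, 48, 30, 15]
grade_12 = [88, 80, 55, 52, 35, 20]

if __name__ == "__main__":
    print(f"r = {pearson_r(grade_6, grade_12):.2f}")
```

A coefficient near +1 for such paired results would correspond to the year-to-year agreement described above; individual pupils may still depart from the general trend.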


Notwithstanding the general tendency just indicated, it 
is true that a pupil’s test record for an entire year may some- 
times be decidedly out of line with his scores in other years. 
When this happens, an explanation that can be made only in 
the light of much other information about the pupil is re- 
quired. Consequently, a cumulative record of other kinds of 
data is needed if one is to make an adequate interpretation 
of the test record. For this reason and many other reasons, 
it is advisable for schools to maintain cumulative records that 
cover not only test results but that include home background, 
class work, interests and activities, personality adjustment, and 
various other factors. The interrelationships of the different 
kinds of information that can be recorded on a form similar 
to the American Council card are brought out by the record 
for Harry Connelly, as shown in Figure 5. 

It is evident from Figure 5 that this boy tended to be below 
the independent-school median in his scores on the Metro- 
politan Achievement Test taken in Grade 8, but that he was 
consistently above the median in scores on all the achievement 
tests taken in Grades 9 to 12. He was especially high in 
English, literary acquaintance, and science. The boy’s superior 
test record in the four high-school grades agrees with his con- 
sistently high percentile ratings on the academic aptitude tests. 

It appears that an explanation of the marked difference 
between Harry’s test scores in Grade 8 and his test scores 
in the later years is to be found in the data entered on the 
back of the card. Although he was obviously bright, he was 
lazy and disorderly in the eighth grade and it is probable 
that he lacked both the preparation and the interest in the test
itself that would be required for high scores on the Metro- 
politan test, or any other test of general achievement. There 
was much improvement in the boy’s behavior and attitude 
in the ninth grade and in subsequent grades, and consequently 
his achievement increased until it was proportionate to his
ability. The whole picture is that of an intelligent, able boy 
who was immature in behavior and attitude but who became 
much more mature during the secondary-school years. At the 
end of the secondary school, he unquestionably had the ability 
and the preparation for better than average college work.
Experience indicates that a record of this kind furnishes a far 
better basis for prognosis of college success than is provided 
by a transcript of credits and an admission form filled out 
by the school when the pupil is near the end of his secondary-school
course.


Cumulative records in terms of percentiles are not the 
only kind of graphic record of test results that can be used. 
If the results of all tests employed in a school’s program 
are expressed in terms of standard scores, Scaled Scores, or 
some other comparable unit, the data may be graphed on that
basis. Such units are sometimes preferable to percentiles for
purposes of showing growth and if they take their origin from
a common standardization group, as do the Scaled Scores
of the Cooperative Test Service, the influence of the selective
factor found in certain subjects, such as the foreign languages,
is obviated.²

It should be clearly understood that cumulative records 
of test results can be kept without the use of any graph whatsoever.
The preparation of the graphic part of the record is,
of course, a time-consuming clerical job. While it is a distinct
aid to interpretation, schools in which the time and cost
of the graph would be prohibitive can maintain usable test
records in tabular form. The main thing is to record the data
in organized fashion so that trends can be discerned.

2For an illustration of a cumulative record based on Scaled Scores see John
C. Flanagan, The Cooperative Achievement Tests: A Bulletin Reporting the
Basic Principles and Procedures Used in the Development of Their System of
Scaled Scores, p. 37. New York: The Cooperative Test Service, December, 1939.


Uses of Cumulative Test Records 


As already indicated, the uses that are made of cumulative 
test records depend largely on the interest, initiative, and 
understanding of the administration and faculty in each local 
school. Among the possible uses of such records are the 
following:

1. Counselors may use cumulative test records in con- 
ferring with pupils and guiding them toward educational and 
vocational choices consistent with their ability and achievement. 

2. Teachers may study them in order to plan their instruc- 
tion to accord with the aptitude, knowledge, and understand- 
ing of the individuals in their classes. 


3. Administrators and personnel officers may refer to 
them when conferring with parents about their children.³


4. Principals and guidance directors may take them into 
consideration when recommending graduates to colleges or to 
prospective employers. 

5. College admissions officers may use them as one type 
of evidence on which decisions about admitting applicants are 
based. In order to conserve the time of the college admis- 
sions officer, the school should of course include a paragraph 
of interpretation when the record is sent to the college. Ad- 
missions officers expect to receive from the school an estimate 
of a candidate’s fitness and they will place more credence in 
the estimate when it is based in part upon tangible information. 

6. Schools may employ them in placing transfer pupils in 
courses to which they are suited. 


7. Administrative officers and department heads may use 
them in sectioning classes on the basis of ability. 





3An excellent discussion of this type of use is given in Robert N. Hilkert, 
“Parents and Cumulative Records,” Educational Record, Supplement No. 13, pp. 
172-83. Washington, D. C.: American Council on Education, January, 1940. 
8. Remedial teachers may consult them in selecting pupils
for special remedial work and in planning that work. 

9. Psychologists and psychiatrists may turn to them for 
leads in diagnosing personality maladjustments and planning 
treatment. 


10. Superintendents and principals may make limited use 
of them in appraising the work of the school and introducing 
modifications. This type of use should be carefully thought 
out and cautiously applied. 

11. Counselors and teachers may refer to them as a means 
of stimulating pupils to do their best work. This is a legiti- 
mate use if the comparison is directly with the previous record 
of each pupil and only indirectly, or not at all, with that of 
other pupils.⁴


12. Finally, the entire faculty may employ cumulative test 
records in developing what is perhaps the school’s most im- 
portant function—the planning for each pupil of a program 
that is suited to him and the individualization of instruction 
in accordance with such a program.⁵


The American Council cumulative record forms, from 
which the card used in the illustrations in this article was 
adapted, are now being revised.⁶ A tentative draft of the
revised high-school form is ready and it will be tried out soon 
in several public high schools. Changes have been made in 
various parts of the record to accord with modern trends in 
education. It is significant that in the revised form the test 


4The use of test records in pupil self-appraisal is described in Richard D. 
Allen, Self-Measurement Projects in Group Guidance, Inor Group Guidance
Series, Volume III. (New York: Inor Publishing Company, Inc., 1934), xviii+ 
274. 

5See Ben D. Wood, “The Need for Comparable Measurements in Indi- 
vidualizing Education,” Educational Record, Supplement No. 12, pp. 5-13. 
Washington, D. C.: American Council on Education, January, 1939. 

6The revision of the American Council cumulative record forms is being 
done by a subcommittee of the Committee on Measurement and Guidance of 
the American Council on Education. The chairman of the subcommittee is 
Eugene R. Smith, Beaver Country Day School, Chestnut Hill, Massachusetts. 







EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT 


section continues to be one of the most important aspects of 
the record. Any forward-looking cumulative record, regard- 
less of whether it is devised by an organization of national 
scope or by a local school system, will inevitably include a 
thorough test record, for it is becoming generally recognized 
that a prerequisite to an adequate program of guidance is a 
comprehensive, systematic testing program. 
THE OBJECTIVES of the student counseling program
at the University of Minnesota have not changed fundamentally
since its inception in the Arts College in 1923. The
aims of the group of Arts College faculty counselors as stated 
in 1928 by Paterson were: “First, to bring about a more
harmonious adjustment of individual students to the opportunities
available within and without the University, and second,
to establish, as far as possible, a friendly and constructive
relationship between individual members of the faculty and 
students desiring such contact.” (2:265-266) 

Subsequent trends in the University’s curricular organiza- 
tion and the developing facilities for personnel work have led 
to a greater differentiation of function within the total counseling
program.² The increasing complexity and the consequent
professionalization of certain types of counseling resulted
in the establishment of the University Testing Bureau
among other specialized agencies for the treatment of student 
problems. This University-wide counseling agency is both 
coordinate and coordinated with the counseling agencies of the 
separate colleges within the University. 





1Assistance in the tabulation and summarization of materials was provided 
by Minnesota work projects under project 6714, sub-project 85, sponsored by 
the University of Minnesota. 

2For an historical treatment of these developments see E. G. Williamson 
and T. R. Sabin. (Minneapolis: Burgess Publishing Company, 1940). 115 pp. 
The Testing Bureau, in its function as a counseling
agency,³ provides professionalized educational and vocational
guidance supplementary to the services of other personnel 
departments on the campus. Counseling is performed on 
an individualized basis, the counselor using information from 
tests, reports from other personnel workers on the campus 
and from community and high school agencies, and from 
clinical interviews with the student. 

In a series of papers published recently, the authors 
treated the problem of the evaluation of these counseling 
services. The first in the series presented a systematic anal- 
ysis of experimental methods as applied to this type of coun- 
seling (4). Based upon our conclusions with regard to 
method, two experimental evaluations of this counseling were 
reported. 

The first of these experiments (5) investigated the rela- 
tive adjustment of students who did and did not cooperate 
with the counselor. The criteria of adjustment and coopera- 
tion were judgments by workers who had not been involved 
in the counseling process and were based on readings of the 
case history and follow-up interviews. The results showed 
that students who cooperated were more likely to be adjusted. 

The second experiment (6) tested the hypothesis that 
students counseled by the Bureau would be better adjusted 
and more successful academically than students who had not 
been counseled by the Bureau or any college counseling agency. 
This hypothesis was found to hold for the comparison of a 
counseled with a matched non-counseled group of freshman 
Arts College students. The criteria in this study were judg- 
ments of adjustment and cooperation and average grade 
achievement. 

Future progress in counseling of this nature will depend 
upon knowledge of the resources and techniques utilized by 
the counselors, the types of problems dealt with, and the 
effectiveness with which these problems were handled. In 


3The Bureau also functions as a University-wide testing agency and as a 
locus for research in testing and counseling. 
this paper we are concerned with giving a representative picture
of the resources and general techniques utilized in the
University Testing Bureau over the period from 1932 to 
1935. 

Analyses of faculty counseling in the Arts College (3, 8) 
and an exploratory analysis of Testing Bureau counseling 
(4:253-260) form the background for the present study. 
The exploratory study of Bureau counseling was based on a 
sampling of 196 student cases, analyzed as to origin, class, 
and college. The representativeness of these cases in terms of 
high school scholarship and college aptitude score was deter- 
mined. Summaries were presented of the kinds of case data 
used, the agencies consulted or referred to for diagnosis and 
treatment, the types and frequencies of student problems, and 
the general counseling techniques used. 

The present study is designed to amplify the description of 
Bureau counseling from a much broader sampling of the 
total case load. A total of 2053 student cases, the bulk 
of students who came in for complete counseling services over 
the period from 1932 to 1935, formed the population for 
this survey.⁴ The actual case history folders, including records
of counseling interviews, were analyzed, and the presence
of certain items of information tabulated. No questionnaires 
filled out by the students were used for this analysis. This 
study, therefore, provides an answer to the question, ‘‘What 
is counseling ?”’ in terms of the judgments of counselors made 
in terms of particular students and not of students in general. 

Of the 2053 cases, 1223 students were men and 830 
were women. Classified according to year in college, there 
were 617 pre-college students (recent high school graduates), 
721 freshmen, 482 sophomores, 143 juniors, 54 seniors and 
36 graduate students. By college the distributions were: Gen- 
eral College, 428; Arts College, 1038; pre-college who did 
not matriculate in the University, 197; Chemistry-Engineer-
ing-Mines, 133; Agriculture, 79; Education, 48; Business, 34;
Medical-Dental-Pharmacy, 23; Graduate School, 36; Nurs-
ing, 18; and University College, 5.

4By “complete counseling services” is meant testing and extensive inter-
viewing. The Bureau also provides many types of testing services for members
of the student personnel staff of the University.
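
The three breakdowns just given can be checked against the stated total of 2053 with a short tally. What follows is only a reader's sketch in Python: the variable and function names are invented for illustration, and the figures are copied as they are printed in this copy of the article.

```python
# Figures copied from the text; all names here are illustrative only.
TOTAL_CASES = 2053

by_sex = {"men": 1223, "women": 830}

by_class = {
    "pre-college": 617,
    "freshmen": 721,
    "sophomores": 482,
    "juniors": 143,
    "seniors": 54,
    "graduate": 36,
}

by_college = {
    "General College": 428,
    "Arts College": 1038,
    "pre-college, did not matriculate": 197,
    "Chemistry-Engineering-Mines": 133,
    "Agriculture": 79,
    "Education": 48,
    "Business": 34,
    "Medical-Dental-Pharmacy": 23,
    "Graduate School": 36,
    "Nursing": 18,
    "University College": 5,
}


def shortfall(breakdown, total=TOTAL_CASES):
    """Return how many cases the breakdown leaves unaccounted for."""
    return total - sum(breakdown.values())


for name, breakdown in (("sex", by_sex), ("class", by_class), ("college", by_college)):
    print(name, sum(breakdown.values()), shortfall(breakdown))
```

As printed here, the sex and class breakdowns each account for all 2053 cases, while the college figures sum to 2039. The text does not say whether the remaining 14 cases were unclassified or whether one of the printed figures is in error.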

Origin of the Cases 

The efficiency of a counseling program must, in part, be 
measured on the basis of its integration and coordination with 
other personnel functions. This criterion implies that coun- 
selors at various levels of specialization are aware of the 
limits of their functions and are making use of the services
of specialized personnel workers through the medium of 
referral. 

With this in mind the origin of the cases counseled in the 
Bureau becomes pertinent to its efficiency. Obviously, the two 
main categories of origin are referred and voluntary. Over 
fifteen years of experience in the counseling program at Min- 
nesota have shown that the best results can be achieved when 
the student comes voluntarily to the counselor or seeks assist- 
ance at the suggestion, but not command, of some member of 
the University staff or student body. Of the total of 2053, 
1069 of the students were classified as voluntary cases. Actu- 
ally, of the 984 remaining students classified as referred cases, 
in only 93 cases was referral made by University officials 
in the spirit of pressure. These were students of low scholar- 
ship who had been referred to the Bureau as part of pro- 
cedures involved in scholastic discipline. A total of 791 stu- 
dents had been referred to the Bureau for testing and coun- 
seling after interviews with a college counselor or faculty 
member. In addition, 100 students had been referred by high 
school counselors or community welfare agencies. 

The largest proportion of the voluntary cases, 892, came 
to the Bureau after having heard about its services through 
bulletins, class lectures, friends, or relatives. In addition, 122 
students were told about the Bureau’s service in the Regis- 
trar’s office and 55 had learned about it from high school or 
college teachers other than counselors. 
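
The counts in the two preceding paragraphs can be laid out as a small table of origins and converted to percentages of the 2053 cases. This is a sketch only; the category labels are paraphrased from the text and the helper name `shares` is an invention for the example.

```python
TOTAL_CASES = 2053

# Referred cases, by source of referral (counts from the text).
referred = {
    "college counselor or faculty member": 791,
    "high school counselor or welfare agency": 100,
    "University officials, in the spirit of pressure": 93,
}

# Voluntary cases, by how the student learned of the Bureau.
voluntary = {
    "bulletins, class lectures, friends, or relatives": 892,
    "Registrar's office": 122,
    "high school or college teachers": 55,
}


def shares(counts, total=TOTAL_CASES):
    """Express each count as a percentage of the total case load."""
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


print("referred:", sum(referred.values()), shares(referred))
print("voluntary:", sum(voluntary.values()), shares(voluntary))
```

The sub-categories reproduce the stated subtotals of 984 referred and 1069 voluntary cases, which together make up the full 2053.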

What distinguishes these students coming by way of vari-
ous campus and community agencies? Analyses of the vari-
ance (1:Chap. V) in high school percentiles and college apti-
tude test scores give a partial answer to the question. Re-
ferred students tend to be lower than voluntary students in
both high school achievement and college aptitude (F values 
of 24.08 and 76.59, respectively; both well beyond the one 
per cent point). While there are significant variances be- 
tween referred and voluntary cases, “t” tests (1:97) re- 
vealed that the variation of the sub-categories in college apti- 
tude was homogeneous within each of the two main divisions. 
This means that there were no reliable differences in ability, 
either among students who had been referred by a faculty 
counselor, a college official, or a high school counselor or 
welfare officer, or among students who came voluntarily as a 
result of contact with high school or college faculty, the Reg- 
istrar’s office, or some informal source of information about 
the Bureau’s services. 
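
For readers unfamiliar with the statistic, the F value in an analysis of variance is the ratio of the between-group mean square to the within-group mean square. The sketch below computes it for two groups. The scores are invented purely for illustration and are not the Bureau's data, so the result bears no relation to the F values of 24.08 and 76.59 reported above.

```python
from statistics import mean


def f_ratio(groups):
    """One-way analysis of variance: between-group over within-group mean square."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical college aptitude percentiles, invented for this example.
referred = [35, 42, 50, 38, 45]
voluntary = [55, 61, 48, 66, 58]
print(round(f_ratio([referred, voluntary]), 2))
```

A large F indicates that the group means differ by more than the scatter within groups would lead one to expect; its significance is then read from a table of F for the appropriate degrees of freedom.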

In the case of high school achievement, differences do exist 
between students referred by high school counselors or wel- 
fare agencies and students referred by college officials, college 
counselors, or faculty members. The relations found between 
type of problems and origin of case tend to clarify the pic- 
ture. Students who have been referred by high school coun- 
selors or welfare agencies are more likely to have financial 
and health problems and are less likely to have vocational 
and educational problems. Thus we may conclude that stu-
dents referred from these sources outside the University are
likely to be students with financial or health problems who
were referred because they had been good students in high
school and had well-developed vocational goals.

The relation of type of problem to origin of case also 
gave indications that the other two types of referred students 
tend to have fewer financial problems and that students who 
came voluntarily were more likely to have vocational problems 
and less likely to have health problems. 

Types and Frequencies of Problems 

Before more adequate evaluations of counseling can be
made, the types and frequencies of problems encountered must
be described. Only then can the foundation be laid for more
precise experimentation in which treatment methods are dif-
ferentiated according to their relative value for various types
of problems or problem constellations.

The scope of this paper is limited to a general description 
of counseling. Another paper is being prepared which will 
describe how these problems cluster for the students counseled 
and with what characteristics these clusters are associated. 
This is conceived as the first step toward evolving a symptom-
atology for use in counseling.


TABLE 1
TYPES OF STUDENT PROBLEMS AS RECORDED IN CASE HISTORIES

                                                            Frequency of
                                                             Occurrence

A. Financial:
   1. Need or desire for part time work, scholarship or
      loans; inadequate finances

B. Vocational:
   1. Poor aptitude for chosen vocation
   2. Inability to decide between two or more vocational
      choices
   3. Definite choice but wants confirmation or
      encouragement
   4. Definite choice but in doubt about aptitude
   5. Definite choice but based only on influence of
      family, friends, etc.
   6. Dearth of interest in any vocation
   7. Information needed about occupations in general
   8. Vocational choice without adequate self-analysis
   9. Inadequate information in regard to professional
      choice

C. Educational:
   1. Poor aptitude for college work
   2. Selection of course in line with occupational choice       632
   3. Inferiority in academic skills such as reading, study
      habits, English usage, etc.
   4. Understanding grading standards
   5. High general aptitude and poor scholastic
      achievements
   6. Understanding responsibilities in college
   7. High aptitude hampered by standard curricula
   8. Outside work interfering with studies
   9. University entrance without proper requirements             48   1842

D. Social, Personal, and Emotional:
   1. Too much social life or too many social activities
   2. Inadequate participation in extra-curricular
      activities
   3. Selecting student activities in line with interests
   4. Social personality traits which may hinder
      professional success
   5. Need for encouragement and self-confidence
   6. Social timidity
   7. Emotional disturbances
   8. Family domination in vocational choice
   9. Conflict with family or friends
  10. Parental anxiety for a wise vocational choice
  11. Fear of intellectual inadequacy
  12. Idealization of a profession
  13. Over-evaluation of a college degree

E. Health and Physical Disabilities:
   1. Serious physical disabilities
   2. Easily fatigued
   3. Inability to do justice to work because of
      intermittent illness
   4. Physical habits, diet and sleep, etc.


Table 1 shows the distribution of the 5876 problems found 
in the case records of these 2053 cases. We see that about 
two-thirds of the problems of the students were of an educa- 
tional or vocational nature. The most frequent vocational 
problems were found among cases of students unable to decide 
between two or more vocational choices or who wanted con-
firmation or encouragement in making a vocational choice.
The most typical educational problems were those of selecting 
a training program appropriate to the vocational choice and 
those due to inferiority in such academic skills as reading, 
study habits, and English usage. 

Social-personal-emotional problems were the next most 
frequent types of problems. Modal sub-types were less marked
here. The three most frequent types of social-personal-emo-
tional problems were: need for encouragement and self-con-
fidence, social personality traits which may hinder professional
success, and emotional disturbances, for the most part of a 
non-psychiatric nature. 

Health problems were infrequently found among these 
cases, 178 problems being discovered and reported to the 
Bureau by the University Student Health Service. 


The Informational Basis of Counseling 


The progress of counseling as a discipline has been char-
acterized by a departure from “gold brick” methods of judg-
ing abilities and character. The trend is toward a greater
reliance upon the systematic collection of data about the indi- 
vidual by means of standardized tests, reports from other 
people who have had contacts with him under diverse condi- 
tions, medical records, and so on. The only vestige of early 
counseling methods is the interview. This is still the most 
vital part of the counseling process, but is now the melting pot 
in which the student and the counselor integrate information 
to draw out a unified picture of the individual and to plan the 
next steps in adjustment. 

In Table 2 we present a summary of the number and fre- 
quencies with which each source of data was consulted by the 
counselor. If frequency and source of data are grouped to- 
gether, 27,866 units of data were used as the basis for coun- 
seling. The most frequent source of information was voca- 
tional and educational tests given in the Testing Bureau. 
Clearance slips from the Faculty-Student Contact Desk (6:83) 
provided information in 2038 cases. 

Other important sources of data were: Health Service 
reports, University Entrance Test rating, and grades from 
the Registrar’s office. The fact that reports from family or 
other relatives were the least frequent sources of information 
could be taken as an indication of the need for a study to 
determine whether a social worker would be a valuable addi-
tion to the Testing Bureau staff. It would be necessary to
determine how much information would be added and how
useful that information would be in diagnosis and treatment.

Counseling Procedures Classified by Type of Problem

One of the areas for research in counseling which has 
been least exploited is the precise description of the counsel- 
ing interview—the relationships between various counseling 
processes and the effectiveness with which each type of prob-
lem is handled. The ultimate objective of such description is
the delineation of these counseling processes in terms of 
fundamental psychological categories. One possible prelimi- 
nary step may be the general description of what the coun- 
selor did with regard to various types of problems. This is 
called a general description because it does not attempt to 
take into account the psychological setting within the inter- 
view (e.g., the specific attitudes the counselor and counselee 
had toward each other at that point) when the behavior 
described occurred. 

In Table 3 we present a general description of the 
Bureau’s counseling procedures. The data show that the com- 
monest procedures in counseling students with financial prob- 
lems took the form of discussing the need for work, discussing 
scholarships and loan funds as a source of money, and dis- 
cussing the relation of part-time work to the student’s class 
schedule. 

In the treatment of vocational problems, the counselors 
relied mainly on discussions of aptitude and on advice and 
recommendations of occupational choice on the basis of test 
results. Other frequent procedures include advising voca- 
tional “tryouts” through college courses, descriptions of occu- 
pations and advising a general background training before a 
definite choice is made. 

With educational problems, the most frequent procedure 
was aid in selecting a schedule of classes in line with aptitude. 
In another large number of cases the counselors discussed 
course prerequisites, sequence of courses, and the like. 
Attempts at cultivation of interests in studies and scholastic
records, recommendation of a non-college type of vocational
training, and recommendation that the student change his
course of study were other procedures with high frequency.

TABLE 3
COUNSELING PROCEDURES CLASSIFIED BY TYPE OF PROBLEM

                                                              Frequency of
Types of Problems                                              Occurrence

Financial Problems
  1. Discussion of relation of part-time work to class
     schedule                                                      71
  2. [illegible]                                                   83
  3. Suggestions of ways of getting jobs                           12
  4. Discussion of student’s expenses and financial resources      54
  5. Discussion of scholarship and loan funds              [illegible]
  6. Letters of recommendation for jobs, scholarships and
     [illegible]                                                   40
  7. Referral to employment bureau                                 22
  8. Referral to financial aid agencies                            31    391

Vocational Problems
  1. Description of occupations                                   232
  2. Referral to informational books                              128
  3. [illegible]                                                 1120
  4. Discussion of student’s financial resources for
     occupational [illegible]                                     125
  5. Vocational tryouts through college courses                   351
  6. Advice and recommendation of occupational choice
     (on [illegible])                                            1346
  7. Advice of general background training before definite
     [illegible]                                                  251
  8. Discussion of method of entering and securing
     employment in chosen occupation                              120   3673

Educational Problems
  1. Use of class schedule for program making              [illegible]
  2. Discussion of course prerequisites, sequence of courses,
     [illegible]                                                  304
  3. Cultivation of interests in studies, scholastic record,
     etc.                                                         121
  4. Explanation of recitation method of study                     13
  5. Discussion of special surroundings conducive to
     effective [illegible]                                         31
  6. Discussion of methods of vocabulary building                  27
  7. Tutorial aid with specific subjects                           20
  8. Aid in selecting a schedule of classes in line with
     aptitude                                                     642
  9. Aid in budgeting hours for study                              49
 10. An attempt to analyze cause for difficulty with a
     [illegible]                                                   38
 11. Explanation of student’s low aptitude as cause of low
     [illegible]                                                   34
 12. Recommendation of non-college type of vocational
     training                                              [illegible]
     Recommend student change course of study              [illegible]
     Discussion of eligibility for prescribed work         [illegible]
     Referral to “How to Study” instructors                [illegible]

Social, Personal and Emotional Problems
     Referral to Health Service for special health examina-
     [remainder of table illegible]

The tabulation of the procedures used with social-per-
sonal-emotional problems indicates that the counselor relied
either on a general discussion for encouragement and assist-
ance with the problem of self-confidence or on a discussion
of worries and other emotional problems. The next most
frequent step was to refer the student to the psychiatrist for
diagnosis.

Discussion of the physical handicaps involved was by far 
the main method used with health problems. The counselor’s 
discussion did not impinge upon medical advice but rather 
upon the relationship between physical condition and educa- 
tional and vocational adjustments. 


Summary 


A general description based on 2053 cases was presented 
as a basis for analytical description of counseling in the 
Testing Bureau of the University of Minnesota over a
four-year period from 1932 to 1935 inclusive.

This description enables us to see how well the Bureau's 
counseling service is coordinated with the general personnel 
program of the University. By broad delineations of problem 
areas handled, of sources and amounts of data used and of 
procedures followed, the authors hope to break ground for 
a much needed basic description of the psychological processes 
involved in counseling interviewing conducted in a non- 
psychiatric guidance clinic for college students. 

What is needed is like descriptions by other counseling 
services which may be based on the same or other philos- 
ophies of counseling. Such an accumulation of data should 
lead to even more specific descriptions in which recorded 
interviews would probably provide the raw data. Ultimately, 
it should be possible to determine experimentally which coun- 
seling procedures are most effective with what types of 
problems. This is the objective of evaluative research in the 
field of counseling.


A COMPOSITION TEST FOR FOREIGN
LANGUAGES 


LAWRENCE ANDRUS 
University of Chicago 


THIS PAPER discusses a type of French composition
test developed at the University of Chicago, which has
in practice yielded remarkably good results as a measuring 
instrument and has proved very stable, i.e., has given com- 
parable scores from year to year. 

The test was developed as a part of the comprehensive 
examination in French 104-105-106, the sequence in Interme- 
diate French given in the College of the University of Chi- 
cago. The College, as the term is used at the University of 
Chicago, includes the years corresponding to the freshman 
and sophomore years of the traditional four-year program. 
The prerequisites for admission to French 104-105-106 are 
two units of high school French or the successful comple- 
tion of French 101-102-103, the sequence in Elementary 
French. Students in the College, as contrasted with more 
advanced students who desire to offer French 104-105-106 
as an elective in a field related to their major field, may gain 
credit for the sequence only by passing the comprehensive 
examination given at the end of the Spring Quarter. The 
great majority of students in the course are College students. 
Since these students pass or fail solely on the basis of the 
comprehensive examination, the staff of the course and the 
examiner attempt, in every possible way, to make the exami- 
nation as valid, as reliable, and as discriminating as they can. 
In the attempt to secure greater reliability, objective ques- 
tions, or questions which can be scored with high objectivity, 
have been devised. 

The Announcement of the College for 1940-41 describes 
French 104-105-106 thus: “The primary objective of the sec-
ond-year sequence is the standardization of the language abili-
ties. To that end there is continuous training in formal and
informal written and oral expression, aural comprehension 
and the accurate determination of the value of the printed 
word. Approximately twenty-five hundred pages are read, 
with reports, following individual programs.” This statement 
is a fair description of the course as given in the preceding four 
years, the period covered by this investigation. The type of 
test here discussed was intended to measure the outcomes of 
training in written expression. It was first used experimentally 
in the comprehensive examination of June, 1934.¹ In substan-
tially its present form, it was included as a part of the 1935
examination, and retained in the following years with minor 
changes in the physical presentation. 

The essential features of this type of test are as follows: 
a French passage is chosen which, in the judgment of the staff 
and the examiner, contains material suitable for testing at the 
level of the course, from the point of view of both vocabu- 
lary and syntax. It should be emphasized that the choice of 
an appropriate passage is extremely important, if the test is 
to yield maximal results. It may be necessary to read many 
pages before a suitable passage is located. This passage is 
then translated into good English. The next step is to go 
through the French text and delete certain words and phrases. 
The corresponding parts of the English translation are under- 
lined and numbered to agree with the numbers replacing the 
omitted words and phrases in the French passage. The stu- 
dent is required to complete the French passage in accordance 
with the English translation. He is guided in this task by 
the numbers and the underlining. 
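
The construction steps just described (delete fragments from the French, number the gaps, and mark the matching parts of the translation) can be sketched in a few lines of Python. This is an illustration only, not the procedure used at Chicago: the function name, the sample sentence, and the choice to mark the English with a preceding number rather than underlining are all assumptions made for the example.

```python
def build_composition_test(french, english, deletions):
    """Blank out French fragments and number the matching English ones.

    deletions: (french_fragment, english_fragment) pairs in passage order.
    Only the first occurrence of each fragment is used.
    Returns (french_with_gaps, english_with_numbers, answer_key).
    """
    answer_key = {}
    for number, (fr, en) in enumerate(deletions, start=1):
        if fr not in french or en not in english:
            raise ValueError(f"fragment for item {number} not found")
        tag = f"({number})"
        french = french.replace(fr, tag, 1)               # leave a numbered gap
        english = english.replace(en, f"{tag} {en}", 1)   # number the cue
        answer_key[number] = fr
    return french, english, answer_key


# A made-up sentence, not taken from any examination.
test_fr, test_en, key = build_composition_test(
    "Le chat dort sur la chaise.",
    "The cat is sleeping on the chair.",
    [("dort", "is sleeping"), ("la chaise", "the chair")],
)
print(test_fr)
print(test_en)
print(key)
```

Because the deletions are chosen by hand, the examiner keeps full control over which vocabulary and which constructions are tested, which is the point of the method.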

A sample taken from the June, 1939, examination will give 
a better idea of the physical arrangement of the test than 
lengthy explanation: 





1See Ernest Haden and John M. Stalnaker, “A New Type of Comprehen-
sive Foreign Language Test,” The Modern Language Journal, XIX, 2 (Novem-
ber, 1934), 81-92.


Translation of French Text on Opposite Page


The old marquis de la Tour-Samuel, (1) eighty-two years old, arose and 
came (2) to lean against the mantelpiece. He said (3) in his (4) somewhat 
trembling voice: 


“(5) I, too, know a strange thing, so strange (6) that it has been the obses- 
sion of my life. (7) It is now fifty-six years (8) since this adventure (9) hap- 
pened to me, and (10) a month doesn’t go by (11) without my seeing it again 
in a dream. (12) There has remained to me from that day a mark, an imprint 
of fear, do you understand me? Yes, (13) I underwent horrible fright, (14) 
for ten minutes, (15) in such a way that since that hour (16) a kind of constant 
terror (17) has remained (18) in my soul. (19) Unexpected noises (20) make 
me start; (21) objects (22) that I make out (23) poorly in (24) the evening
shadow give me (25) a mad desire (26) to run away. Finally, I’m afraid (27) 
at night. 


“Oh! (28) I shouldn’t have admitted (29) that (30) before having arrived 
at my (31) present age. Now I can say (32) everything. It is permitted 
(33) not to be brave before imaginary dangers, when (34) you are eighty-two. 
Before real dangers, (35) I have never retreated, ladies.”


Certain words and phrases have been omitted from the following French
passage, and a number has been substituted for each omitted word or phrase.
In each numbered space at the right, write in FRENCH the appropriate word
or phrase. Be sure that your translation fits the French context. An English
translation of the passage is given on the page opposite; the translation of each
omitted word or phrase is underlined and preceded by a number which cor-
responds to the number in the French passage for each in reference. Note that
there is not always exact correspondence in form between the French and the
English. The old marquis uses the conversational, that is, informal, style, until
he begins to tell his story, which is in literary style.


Le vieux marquis de la
Tour-Samuel (1), se leva
et vint (2) la cheminée.
Il dit (3) sa voix (4):

— (5) sais une chose
étrange, tellement étrange
(6) a été l’obsession de
ma vie. (7) maintenant
cinquante-six ans (8)
cette aventure (9), et
(10) (11) en rêve. (12)
de ce jour-là une marque,
une empreinte de peur,
me comprenez-vous? Oui,
(13) l’horrible épouvante.
(14), (15) que depuis
cette heure (16) terreur
constante (17) (18). (19)
(20); (21) (22) je dis-
tingue (23) dans (24) me
donnent (25) (26). J’ai
peur (27), enfin.

Oh! (28) (29) (30) à
mon âge (31). Maintenant
je peux (32) dire. Il est
permis (33) devant les
dangers imaginaires,
(34). Devant les dangers
véritables, (35), Mes-
dames.


(1) ________   (2) ________   (3) ________   (4) ________   (5) ________
(6) ________   (7) ________   (8) ________   (9) ________   (10) ________
(11) ________  (12) ________  (13) ________  (14) ________  (15) ________
(16) ________  (17) ________  (18) ________  (19) ________  (20) ________
(21) ________  (22) ________  (23) ________  (24) ________  (25) ________
(26) ________  (27) ________  (28) ________  (29) ________  (30) ________
(31) ________  (32) ________  (33) ________  (34) ________  (35) ________


A casual inspection of the sample suffices to reveal that this
test form makes possible the use of a great variety of items, 
both as to content and as to length. The items may all be 
classified under one heading: usage, with subclassification 
under active vocabulary (including idioms) and grammar (in- 
cluding syntax). By the proper choice of items tested, it is 
possible to vary at will both the level of difficulty and the pro- 
portion of vocabulary items and grammar items. In this way, 
validity with reference to specific objectives and content of a 
given course of study may be built into the test. For instance, 
in French 104-105-106 at the University of Chicago, a com- 
mon practice has been to restrict items used in this test to the 
2,500 words of highest frequency in the Vander Beke French 
Word Book.² A similar procedure may be followed with
respect to idioms, by using the Cheydleur French Idiom List.³
Note that both upper and lower limits may be adopted. At 
present, grammar items must be validated on the basis of text 
books used and the subjective judgment of the instructing 
staff. When the French Syntax Count, begun under the direc- 
tion of the late Professor Coleman, and now proceeding under 
the direction of Professor Keniston, is finally available, there 
will be an objective criterion of difficulty for French syntac- 
tical constructions. In Spanish, this invaluable aid has already 
been published.⁴

Theoretically, the most discriminating type of item for 
use in an achievement examination is one answered correctly 
by 50 per cent of the group taking the examination.⁵ In
practice, we almost never find a test containing even a majority
of items of this type, except in the case of standardized tests
which have been refined by statistical procedures, and even then, 





2George E. Vander Beke, French Word Book (New York: Macmillan, 1929).
3Frederic D. Cheydleur, French Idiom List, Based on a Running Count of
1,183,000 Words (New York: Macmillan, 1929).
4Hayward Keniston, Spanish Syntax List (New York: Henry Holt & Co.,
1937).
5Thelma Gwinn Thurstone, “The Difficulty of a Test and its Diagnostic
Value,” The Journal of Educational Psychology, XXIII, 5 (May, 1932), 335-343.

perhaps only with reference to the group on which the test was
standardized. The classroom teacher interested in using the 
kind of test herein discussed would obviously have neither the 
time nor the statistical knowledge to go through the various 
steps required to develop a test composed largely of the most 
discriminating items. A fair approximation can, however, be at- 
tained by remembering that items that will be passed by prac- 
tically all the group of students, or by almost none, have very 
little value for discrimination. They might be called “dead 
wood.” The experienced teacher and his colleagues can, by 
subjective judgment, identify many such unprofitable items. 
Repeated use of a given test form and inspection of the re- 
sults (not necessarily involving a formal item analysis, al- 
though that is always desirable when practicable), will tend 
to bring the teacher’s subjective judgment of the worth of an 
item closer to an objective evaluation. It goes without saying 
that the value of an item as regards discrimination varies 
with the level of instruction and the content and method of 
the course, and should always be estimated in terms of these 
latter. To take a hypothetical example, in one school an 
item involving a particular use of the definite article in French 
might be highly discriminating, whereas in a school in a neigh- 
boring town, using a different course of study and a different 
method, the students might have received so much drill on 
this particular point that an item involving it would be passed 
by practically every student, and, hence, be of very little value 
for discrimination. 
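
The informal inspection of results recommended above can be approximated by computing, for each item, the proportion of students who passed it. An item scored right or wrong has variance p(1 − p), which is largest at p = .50 and falls to zero as p approaches 0 or 1; this is the arithmetic behind the "dead wood" label. The sketch below is a hedged illustration: the cut-off values of .10 and .90 and the function names are assumptions of the example, not figures from the article.

```python
def proportion_passing(responses):
    """responses: one list per student; 1 = item correct, 0 = incorrect."""
    n_students = len(responses)
    return [sum(column) / n_students for column in zip(*responses)]


def dead_wood(p_values, low=0.10, high=0.90):
    """Return 1-based numbers of items passed by almost all or almost none."""
    return [i for i, p in enumerate(p_values, start=1) if p <= low or p >= high]


# Four hypothetical students, three items.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
]
p_values = proportion_passing(responses)
print(p_values)
print(dead_wood(p_values))
```

In this toy group the first item is passed by everyone and the third by no one, so neither separates the students; only the second, passed by half the group, does any ranking.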

If the passage chosen, although otherwise desirable, is 
judged to lack an adequate number of instances of a particular 
construction considered important by the instructing staff, it is 
frequently possible to add items involving this construction 
by making slight changes in the French passage. It may not 
even be necessary to change the English translation at all, 
after such revision has taken place. The vocabulary items 
can be controlled in like manner. 

The scoring of this test can be made very objective. At 
the time the test is constructed, as complete a key as possible 


360 
























COMPOSITION TEST FOR FOREIGN LANGUAGES 


is prepared, preferably by the entire staff of the course. This 
key facilitates the work of the scorer, who must, however, 
know the language thoroughly. The scoring cannot be en- 
trusted to clerks. Whenever the scorer meets a correct 
answer not included in the key, he adds it to the key. Even 
with the necessity for consideration of such answers, the 
scoring is very rapid. In syntactical items, minor errors in 
spelling and mistakes in accents are disregarded, provided 
that the student uses correctly the construction on which the 
item hinges. The scoring thus becomes nearly as objective 
as that of a multiple-choice test. 
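
The scoring rules above (a key that grows as new correct answers turn up, and tolerance of accent mistakes on syntactical items) can be sketched as follows. This is a simplified illustration under stated assumptions: the function names and the sample key are hypothetical, and the code forgives only accents, letter case and spacing. Deciding what counts as a "minor" spelling error still calls for a scorer who knows the language.

```python
import unicodedata


def strip_accents(text):
    """Remove accent marks by decomposing characters and dropping the marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalise(text, ignore_accents):
    text = " ".join(text.lower().split())   # fold case, tidy spacing
    return strip_accents(text) if ignore_accents else text


def is_correct(answer, accepted, ignore_accents=False):
    """True if the answer matches any accepted answer for the item."""
    target = normalise(answer, ignore_accents)
    return any(normalise(a, ignore_accents) == target for a in accepted)


# Hypothetical key for a single item; not taken from any examination.
key = {1: {"à la maison"}}

print(is_correct("a la maison", key[1], ignore_accents=True))
print(is_correct("a la maison", key[1], ignore_accents=False))

# The scorer meets a correct answer not yet in the key and adds it.
key[1].add("chez lui")
print(is_correct("chez lui", key[1]))
```

Setting `ignore_accents` per item mirrors the distinction drawn in the text: accents are overlooked where the item hinges on a construction, but need not be where the item tests the word itself.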


TABLE 1 


COMPOSITION TEST — FRENCH 104-105-106 
                                      1937     1938     1939     1940

No. of items .....................     112      100      100      100
[label illegible] ................     112      100      100      100
No. of points in comprehensive
  examination ....................     545      495      485      485
Mean score .......................   52.09    51.75    45.96    54.52
Standard deviation ...............   17.80    15.95    14.86    13.98
Reliability* .....................     .94      .94      .92      .91
Standard error of measurement
  (σ√(1 − r)) ....................    4.36     3.91     4.20     4.19
Correlation with entire compre-
  hensive examination ............     .92      .91      .88      .90
[label illegible] ................ [illegible]  60       78       52


*Estimated by Kuder-Richardson formula No. 20,
$r = \dfrac{n}{n-1}\cdot\dfrac{\sigma^{2} - \sum pq}{\sigma^{2}}$

Table 1 shows the main results of a statistical analysis of 
the different forms of this test used in the comprehensive ex- 
aminations in French 104-105-106 at the University of Chi- 
cago during the four-year period 1937-1940 inclusive. 
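
For readers who wish to reproduce a reliability estimate of the kind reported in Table 1, the sketch below computes Kuder-Richardson formula No. 20 in its usual published form, r = [n/(n − 1)] · [(σ² − Σpq)/σ²], from a matrix of right/wrong item scores. The data are invented for illustration, the function name is an assumption, and the variance of total scores is taken over N rather than N − 1.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for rows of 0/1 item scores."""
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    variance = sum((t - mean_total) ** 2 for t in totals) / n_students
    # p = proportion passing each item, q = 1 - p
    sum_pq = 0.0
    for column in zip(*responses):
        p = sum(column) / n_students
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (variance - sum_pq) / variance


# Four hypothetical students, three items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(responses), 2))
```

With only three items the estimate is necessarily crude; the forms analyzed in Table 1 had 100 or more items.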

We note that the mean score in all four years was in the 
general neighborhood of 50 per cent of the possible number 
of points in the test. This is equivalent to saying that the 
average item was answered correctly by about 50 per cent of 
the group taking the examination. In 1940, a few items were 








*See G. F. Kuder and M. W. Richardson, “The Theory of the Estimation
of Test Reliability,” Psychometrika, II, 3 (September, 1937), 151-60.

purposely included which seemed a priori rather easy for the
group tested, in order to test the effectiveness of such a priori 
judgment. These items were answered correctly by most of 
the students, and are reflected in the higher mean score for 
the 1940 test. 

In each of the four forms of the test, the standard devia- 
tion was large enough so that the students’ scores were well 
spread out, thereby facilitating classification of the students 
in rank order of merit. The differences in size of the stand- 
ard deviation from one year to another are no greater than 
differences often found in administering the same test to two 
different groups of students, one slightly more homogeneous 
than the other. The spread of students’ scores on this test 
is thus quite comparable from year to year. 

For a test of 100 items, a reliability of .90 is commonly 
considered good. The lowest reliability coefficient estimated 
for the four-year period was .91 for the 1940 form; the 
highest, .94 for the 1938 form (this is relatively better than 
.94 for the 1937 form, since the latter contained twelve more 
items). 
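
As a rough illustration of how a reliability coefficient of this kind is obtained, the following Python sketch computes Kuder-Richardson formula No. 20 from a small matrix of right-wrong item scores. The function name `kr20` and the five-student, four-item data are invented for the illustration and are not taken from the French examination; the sketch also assumes that every item is scored 0 or 1 and that the variance of total scores is the population variance.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for rows of 0/1 item scores.

    responses: one list per student, one 0/1 entry per item.
    """
    n_students = len(responses)
    n_items = len(responses[0])

    # Variance of the students' total scores (population form; an assumption).
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)

    # Sum over items of p*q, where p is the proportion answering correctly.
    sum_pq = 0.0
    for item in range(n_items):
        p = sum(row[item] for row in responses) / n_students
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (total_variance - sum_pq) / total_variance


# Hypothetical data: five students, four items.
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(toy_responses), 2))
```

With real data the same function would be applied to the full set of item scores for each year's form; it is undefined when every student earns the same total, since the variance in the denominator is then zero.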

In none of the four years is the standard error of meas- 
urement as large as one-tenth of the mean score, and in none 
is it as large as one-third of the standard deviation. These 
values are satisfactorily low. They indicate that chance 
error in measurement has been kept within reasonable limits. 
Note that the difference between the highest and lowest 
standard errors of measurement here reported is only .45. To 
illustrate the meaning of this slight difference between the 
two extremes, let us assume that a student in 1937 and a 
student in 1938 each have a score of 40.00. The chances 
are two out of three that the true score of the 1937 student 
lies between 35.64 and 44.36, and that the true score of the 
1938 student lies between 36.09 and 43.91.* 
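
The arithmetic behind these figures can be retraced with a short Python sketch. It takes the standard deviations and reliabilities from Table 1, applies the formula given there for the standard error of measurement, and forms the band of one standard error on either side of an obtained score of 40. The function names are chosen for the illustration only, and the band is read as the author reads it.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Standard deviation times the square root of (1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def one_sem_band(score, sem):
    """Interval of one standard error on either side of an obtained score."""
    return (score - sem, score + sem)


# (standard deviation, reliability) for each form, as given in Table 1.
table_1 = {
    1937: (17.80, 0.94),
    1938: (15.95, 0.94),
    1939: (14.86, 0.92),
    1940: (13.98, 0.91),
}

sems = {
    year: round(standard_error_of_measurement(sd, r), 2)
    for year, (sd, r) in table_1.items()
}

for year in (1937, 1938):
    low, high = one_sem_band(40.0, sems[year])
    print(year, sems[year], round(low, 2), round(high, 2))
```

The rounded values agree with the standard errors reported in Table 1, and the two bands agree with the limits quoted above for the 1937 and 1938 students.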


*Editors’ Note: It will be recognized that not all statisticians would agree
as to the validity of this interpretation.


In only one year, 1939, does the correlation of students’
scores on the composition test with their scores on the entire
comprehensive examination fall below .90. The 1939 distribution
includes one case (that of a student not registered
for the course) which is so unlike the other cases in the group 
that it lowers the correlation by at least .01. It is customary 
to regard the correlation of a sub-test with the entire exami- 
nation as spuriously high, since the sub-test is being corre- 
lated with a test of which it forms a part. In each of the 
four years, however, the composition test represents only 
about one-fifth of the total number of points in the entire 
comprehensive examination, and yet the correlation of this 
part with the entire comprehensive examination remains uni- 
formly high, although a good part of the remaining material 
in the comprehensive examination has changed in character 
from year to year. This phenomenon, taken in connection 
with the relatively constant mean, standard deviation, relia- 
bility, and standard error of measurement, leads to the con- 
clusion that under the conditions prevailing in French 104- 
105-106 at the University of Chicago this composition test 
represents a particularly stable type of measuring instrument. 


Everyone must agree that the best method of testing 
French composition would be to require the student to write 
a free composition in French, if such tests could be scored 
reliably. Unfortunately, reliable scoring is in practice very 
hard to obtain. In the few instances where moderate success 
has been achieved, the process requires a great deal of time 
and involves essentially using the services of a jury of experts. 
Most teachers would probably agree that, at least at the 
lower and intermediate levels, the two elements which would 
assume the greatest importance in their judgment of a free 
composition in French are active vocabulary (including idi- 
oms) and grammar (including syntax). This type of test is 
capable of measuring students’ achievement in these two ele- 
ments reliably and objectively. The writer feels that at the 
lower and intermediate levels it is wiser to use a test which 
can do this than to run the risk of unreliable measurement 
which use of free composition entails. 


The results of the statistical analysis shown in Table 1
can be accepted without question only for French 104-105-106
at the University of Chicago. There is no guarantee, but
there is a strong presumption, that a test of this kind, constructed
elsewhere with equal care, and with due attention to
the objectives, content, and method of the course of study,
would yield equally favorable results. The technique could
certainly be applied to Spanish, Italian, and Portuguese as 
well as to French; the sentence structure of German might 
prevent the technique from being as effective in that language 
as in the Romance languages.



PERFORMANCE TESTING IN PUBLIC
PERSONNEL SELECTION


PART II¹


SIDNEY W. KORAN
Employment Board, Pennsylvania Department of Public Assistance


The Test for Graphotype-Addressograph Operators


THE POSITION of graphotype-addressograph operator
occurs in each of the four regional financial offices of 
the Department of Public Assistance. Since these offices were 
conveniently located about the State and were equipped with 
a sufficient number of graphotype and addressograph machines 
to insure rapid and efficient conduct of the performance test, 
each of the four was used as an examination center and the 
examinees were permitted to appear at the one of their choice. 
To minimize difficulties likely to arise because some exam- 
inees might be unfamiliar with the particular models on which 
the test was to be given, the notification form sent to each 
examinee included the statement: “The examination to which 
you have been assigned has been designed to test your ability 
to operate the Class 6300 Graphotype and the Class 2700 
Addressograph.”

The test consisted of the following two parts and was set 
up and scored so that much the greater emphasis was placed 
on Part I: 

I. Embossing names and addresses on Addressograph
plates with the Class 6300 Graphotype. 

II. Printing cards from the embossed plates with the 
Class 2700 Addressograph. 


Standardization of the procedure was achieved by (1)
having all tests administered under the supervision of trained
individuals, (2) requiring the examiners to repeat all directions
to the examinee verbatim from the copy, (3) providing
each examinee with a set of instructions setting forth
the nature of the examination he was about to take and exactly
what was expected of him, (4) using a stop watch for
all timing, and (5) making a careful mechanical inspection of
every machine at the beginning of each period of testing.

¹Part I of this article appeared in the July issue of this Journal.

About 10 minutes before he was assigned to a machine
each candidate was given a copy of the Instructions to Examinees
(see Exhibit G) and told to read them carefully and
to refer to them as often as he wished. When his turn arrived
he was assigned to a Graphotype, furnished with a file
drawer containing 22 plates and 20 frames, and given five 
minutes to familiarize himself with the machine and to prac- 
tice embossing two of the plates. At the expiration of the 
practice period the examiner collected the two practice plates, 
furnished the examinee with the list of names and addresses 
to be embossed, and read aloud the appropriate sections of 
the Instructions to Examiner (see Exhibit H). At the expira- 
tion of 10 minutes the examinee was directed to stop emboss- 
ing and to place the completed plates into frames. He was 
then assigned to an Addressograph and required to print a 
card from each embossed plate. If the examinee was unable 
to operate the Addressograph sufficiently well to print a leg- 
ible copy from each plate, the examiner had an assistant print 
the plates on a strip of paper so that a record of the candi- 
date’s performance on the Graphotype would be available for 
scoring. 

As with the Telephone Operator test (described in Part 
I of this article), the scoring procedure was designed to elimi- 
nate those whose performance fell below the standard estab- 
lished as the minimum acceptable, and to produce quantitative 
ratings reflecting the relative operating ability of those who 
survived that elimination. In establishing the qualifying 
point, consideration was given to (1) the requirements of 
the job, (2) the calibre of the individuals employed and available
for employment, and (3) data on the agency’s previous
experiences with a performance test for this type of position.

The procedure followed in scoring the plates of those
who met this criterion took into consideration both the speed
and the accuracy with which the plates had been embossed.
The examinee’s raw score on the Graphotype portion of the
test was determined by subtracting his error score (computed
by counting each deviation from the key² as one error) from
the total number of strokes completed. Examinees who demonstrated
ability to operate the Addressograph were given
additional credit up to 10 per cent of the total allowed for
the complete performance test. Keys were constructed which
reduced the scoring task to a routine operation.
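
The scoring rule just described lends itself to a small worked example. The Python sketch below is only a present-day analogue of the hand scoring against a key: it counts deviations as the smallest number of wrong, inserted, or omitted characters (an edit distance), which is an assumption about how a scorer would tally them, and it treats each embossed character, spaces included, as one stroke. The names, addresses, and function names are hypothetical.

```python
def count_deviations(key_line, embossed_line):
    """Smallest number of wrong, inserted, or omitted characters
    separating an embossed line from the key (edit distance)."""
    previous = list(range(len(embossed_line) + 1))
    for i, key_char in enumerate(key_line, start=1):
        current = [i]
        for j, embossed_char in enumerate(embossed_line, start=1):
            substitution = previous[j - 1] + (key_char != embossed_char)
            omission = previous[j] + 1      # key character missing from plate
            insertion = current[j - 1] + 1  # extra character on plate
            current.append(min(substitution, omission, insertion))
        previous = current
    return previous[-1]


def graphotype_raw_score(strokes_completed, error_score):
    """Raw score = total strokes completed minus the error score."""
    return strokes_completed - error_score


# Hypothetical key lines paired with what an examinee embossed.
plates = [
    ("JOHN SMITH", "JOHN SMYTH"),         # one incorrect letter
    ("12 MAIN ST", "12 MAN ST"),          # one omitted letter
    ("HARRISBURG PA", "HARRISBURG  PA"),  # one spacing error
]

error_score = sum(count_deviations(key, made) for key, made in plates)
strokes_completed = sum(len(made) for _, made in plates)  # assumption: 1 char = 1 stroke
raw_score = graphotype_raw_score(strokes_completed, error_score)
print(error_score, strokes_completed, raw_score)
```

Because the subtraction rewards both the quantity embossed and its correctness, a fast but careless operator and a slow but exact one can arrive at similar raw scores, which is the balance of speed and accuracy the procedure was meant to strike.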


The Test for Tabulating Machine Operators 

The position of tabulating machine operator (IBM equip- 
ment) occurs only in the operating agency’s State office. In 
administering the test in cities other than Harrisburg it was 
therefore necessary to arrange to use the facilities of the 
IBM Service Bureau. 

To discourage individuals whose entire practical experi- 
ence had been confined to the operation of sorters, numeric 
accounting machines, or Powers equipment from reporting 
for the performance test, the following statement, intended as 
a reminder, was included in the notification form sent to all 
examinees: “As previously indicated, the examination for this
position has been designed to test your ability to operate both 
the IBM horizontal counting sorter and the IBM alphabetic 
accounting machine.” 

The test consisted of a two-part exercise. In Part I the 
examinee was required to (1) wire the plugboard of the 
alphabetic accounting machine for listing and for totals, (2) 
use the horizontal sorter, and (3) adjust and operate the 
alphabetic accounting machine so that certain alphabetic and 
numeric data from previously punched cards would be listed 
the same way as the data shown on the specimen form provided
with the Instructions to Examinees (Exhibit I). Part
II of the exercise was an extension of Part I in that it required
the examinee to list specific data from the tabulating cards
after (1) wiring the same plugboard to provide for numeric
control and subtotals, (2) re-sorting the punched cards, and
(3) making several additional adjustments on the accounting
machine.

²Examples of deviations penalized were (1) incorrect letter or number;
(2) spacing error of any kind, including line space; (3) insertion of a letter
or number; and (4) omission of a letter or number.

About 10 minutes before the candidate was required to
start the actual test, he was provided with a copy of the Instructions
to Examinees and told to read them carefully and
to refer to them as often as necessary throughout the exami- 
nation. Attached to the sheet of Instructions were a sample 
punched card and a specimen sheet showing the form in which 
the data were to be listed by the alphabetic accounting ma- 
chine in Part I of the exercise. A sample card and a portion 
of the specimen form sheet are reproduced as Exhibits J and 
K respectively. 

As soon as the examinee was ready to start the test he was 
provided with a plugboard, an adequate supply of the vari- 
ous sizes of wires needed in making the connections, and, for 
reference purposes, a type-bar layout and a plugboard dia- 
gram. The examinee was then reminded that the time limit 
for the entire test was one hour and forty-five minutes. 

Elapsed time was recorded by means of an electric job 
clock by stamping the starting time and finishing time of 
each operation on the examinee’s job card. Since it was easier 
to secure the use of plugboards than tabulating and account- 
ing machines, the equipment at each center included about 
three times as many plugboards as it did pieces of mechanical 
equipment. As a result of this it was sometimes necessary 
for an examinee to wait a few minutes for his turn at an alpha- 
betic accounting machine. At no time, however, was it found 
that he was required to wait longer than 10 minutes, and 
this “lost time” was, of course, automatically taken care of
by the job clock timing method employed. The small delays
caused by having several times as many examinees wiring plugboards
as could be accommodated at the machines were so
slight as to cause virtually no inconvenience to the candidates.
On the other hand, the saving in time and expense which re- 
sulted from following this procedure was considerable. 
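
The effect of the job clock method can be shown with a brief Python sketch. It assumes, for illustration only, that the job card carries a starting and a finishing stamp for each operation in hours and minutes; the card layout, the times, and the function name are hypothetical. Because only the stamped intervals are added, any wait between operations drops out of the total.

```python
from datetime import datetime


def working_minutes(job_card):
    """Add up the stamped (start, finish) intervals, in minutes.

    Time between one operation's finish and the next operation's
    start (for example, waiting for a machine) is not counted.
    """
    stamp_format = "%H:%M"
    total = 0.0
    for start, finish in job_card:
        elapsed = datetime.strptime(finish, stamp_format) - datetime.strptime(
            start, stamp_format
        )
        total += elapsed.total_seconds() / 60
    return total


# Hypothetical job card: wiring, then an 8-minute wait, then two machine runs.
job_card = [
    ("09:00", "09:20"),  # wiring the plugboard
    ("09:28", "09:41"),  # Part I on the accounting machine
    ("09:41", "09:55"),  # Part II
]
print(working_minutes(job_card))
```

On this card the examinee is charged with 47 minutes, although 55 minutes passed between the first and last stamps.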

When the examinee had finished wiring the plugboard to 
his satisfaction, one of the examiners inspected it to make 
certain that no connections had been made which would be 
likely to damage the machine. The examinee was then given 
a pack of 35 punched tabulating cards and directed to continue
with Part I of the test. No limit was placed on the
number of sheets of paper the examinee might find it
necessary to use or on the number of times he was permitted
to make changes in the plugboard wiring or machine adjustments.
He was told to write his identification number on
each sheet and to write “final copy” on the one he wished to
submit for scoring. When Part I had been completed, the
examinee continued immediately with Part II.

During the test the examiners made no attempt to rate the
candidates on such points as the correctness of their particular
approach, the acceptability of their work habits, or, as
already mentioned, the number of times they found it necessary
to change the wiring or readjust the machine. The only
factors taken into consideration in scoring the test were (1) 
the accuracy with which the assignments had been carried out, 
as shown by the finished products, and (2) the length of time 
consumed by the examinee in completing both parts of the 
exercise. 

Reproduced as Exhibit L is a copy of the rating form
which has been marked to show the scores of a typical examinee.
The number in the parentheses after each item on the
rating form is the maximum score obtainable for that item.
An examinee completing both parts of the exercise correctly
within 45 minutes would receive 30 points for Part I, 30
points for Part II, and 40 points for finishing within the
minimum time bracket, making a total score of 100. A functional
breakdown of the 60 points assigned to Parts I and
II shows the following: sorting, 10 points; location of alphabetic
and numeric fields, 30 points; totals and subtotals, 20
points.
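
The way the time brackets combine with the accuracy points can be sketched in Python as follows. The brackets are those printed on the rating form (Exhibit L); the credit of 20 points for 61 to 75 minutes is assumed from the pattern of the other brackets, and the treatment of times beyond the limit of one hour and forty-five minutes is likewise an assumption. The function names are chosen for the illustration.

```python
# (upper limit in minutes, credit) for each time bracket on the rating form.
# The 61-to-75-minute credit of 20 is an assumption read off the pattern.
TIME_BRACKETS = [(45, 40), (60, 30), (75, 20), (90, 10), (105, 0)]


def time_credit(minutes_used):
    """Points allowed for the time consumed on both parts together."""
    for upper_limit, credit in TIME_BRACKETS:
        if minutes_used <= upper_limit:
            return credit
    return 0  # assumption: no time credit beyond the 105-minute limit


def total_raw_score(part_one_points, part_two_points, minutes_used):
    """Accuracy points for Parts I and II (30 each) plus the time credit."""
    if not (0 <= part_one_points <= 30 and 0 <= part_two_points <= 30):
        raise ValueError("each part carries at most 30 points")
    return part_one_points + part_two_points + time_credit(minutes_used)


print(total_raw_score(30, 30, 45))  # both parts correct within 45 minutes
print(time_credit(48))              # the 48 minutes recorded in Exhibit L
```

A perfect performance within 45 minutes thus yields the full 100 points, while the 48 minutes entered on the sample rating form fall in the second bracket.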


The Test for Duplicating Machine Operators

The position of duplicating machine operator occurs in 
the State offices of the operating agency and the merit system 
agency. Persons filling these positions are required to oper- 
ate several models of mimeograph and multilith machines. As 
most of the work involves the continuous use of the Class 
1200 multilith and the Model 100 mimeograph in the per- 
formance of a large variety of duplicating jobs, and persons 
who can satisfactorily operate these models ordinarily have 
very little difficulty operating the older and less complicated 
types of equipment, the performance test was built around 
these particular machines, and the examinees were so informed 
well in advance of the date of the examination. 

A copy of the Instructions to Examinees (Exhibit M) 
was furnished each candidate at least 10 minutes before he 
was required to begin the test. He was told to read these 
Instructions carefully and to keep them with him for refer- 
ence throughout the examination. 

The test for mimeograph operators was administered first. 
Each examinee was provided with a mimeograph stencil into 
which a solid box of typewritten material measuring 27%” by 
4¾” had been freshly cut, 75 white file cards measuring 4” by 6”, and
75 sheets of letter-size mimeograph paper on which a 3%” 
by 5%” frame had been printed. The examinee was then 
referred to his printed instructions which directed him to (1) 
place the stencil on the cylinder, (2) adjust the machine and 
duplicate 25 cards so that the material which had been typed 
on the stencil was centered on each card, (3) readjust the 
machine and duplicate 25 sheets so that the typed material 
was centered within the preprinted frame, and (4) remove 
the stencil and prepare it to be filed for future use. The 
frame preprinted on the letter-size paper was located in a 
position which required maximum adjustment of the machine’s 
margin guides before the typewritten material on the stencil 
could be made to print within the required borders.

When he was ready for the multilith portion of the test,
the candidate was provided with a photographic multex plate 
containing typewritten material measuring 6” by 8”, 75 sheets 
of letter-size bond paper on which a 6%” by 84” frame 
had been printed, and a supply of platex, keepeze, blankrola, 
repelex, and absorbent cotton. He was then referred to his 
printed instructions which directed him to (1) apply the pla- 
tex, (2) put the plate on the machine, (3) adjust the machine 
and duplicate 25 sheets so that the typewritten material on 
the plate was centered within the preprinted frame, (4) re- 
move the plate and prepare it to be filed for future use, and 
(5) clean the blanket. Two forms of the multilith plate were 
used alternately. These forms differed from each other only 
in the location of the typewritten material on the plate and 
were designed so that, while all candidates were required to 
make exactly the same kind of adjustments to center the mate- 
rial properly, the examiners were relieved of the necessity of 
setting up the machine after each run. 

The examiners observed the candidates from a reasonable 
distance throughout the test in order to complete the rating 
form shown as Exhibit N. Timing was accomplished with a 
stop watch, and considerable use was made of the remarks 
column of the rating sheet to record all occurrences likely to 
contribute toward a fair evaluation of the examinee’s per- 
formance. Whenever necessary, candidates were told how to 
turn on the particular model of equipment on which the test 
was given. Certain other bits of information, such as the use 
of the ink rolls on the multilith and the side margin adjust- 
ments on both the multilith and the mimeograph were also 
given, but careful note was made of the circumstances in each 
case so that the stipulated penalties could later be subtracted. 

The half hour time limit on each machine was mentioned 
in the Instructions to Examinees but was not particularly em- 
phasized. As the candidate started each part of the test the 
examiner said, “You will be allowed up to 30 minutes to 
complete this part of the test. The time you actually consume
will enter into the computation of your score, but you
ought not to work so rapidly that the quality of your work
suffers.” Very few candidates required more than 15 minutes 
to complete the mimeograph test or more than 20 minutes 
for the multilith test, and most of those requiring more than 
this time were examinees who either got off to a bad start or 
were obviously so unfamiliar with the equipment that they 
were simply persevering while hoping for a miracle to occur. 
Persons in the latter group were encouraged to continue as 
long as they did not endanger the equipment. The additional 
time required to do this was cheerfully charged to public 
relations when it was discovered that, instead of rationalizing 
that if they had been given more time they would have suc- 
ceeded, these candidates eventually insisted on withdrawing 
of their own accord and almost invariably thanked the 
examiners for “being so patient with me and giving me 
every break.” 

Scoring was accomplished by determining the number of
points earned by the examinee for correctly performing
each of the items listed in the schedule of credits (Exhibit
O), and then entering the appropriate amounts in the spaces
provided on the summary sheet (Exhibit P). As finally
worked out, the schedule of credits provided a weight of 60
for the multilith portion of the test, and 40 for the mimeograph.

In establishing the number of credits to be allowed for the 
successful accomplishment of each “item” of the test, con- 
sideration was given to the relative difficulty of the particular 
function under consideration. Thus, in scoring the mimeograph 
portion of the test, twice as much credit (8 points) was 
granted when the examinee’s finished product presented evi- 
dence of correct side-margin adjustments as when the vertical 
margins were satisfactory (4 points). For the multilith, on 
the other hand, more credit (10 points) was given for proper 
adjustment of the side margins than for having the vertical 
margins correct (6 points).


Exhibit G
COMMONWEALTH OF PENNSYLVANIA
EMPLOYMENT BOARD
OF THE
DEPARTMENT OF PUBLIC ASSISTANCE
Harrisburg
PERFORMANCE TEST - Series 1900
GRAPHOTYPE AND ADDRESSOGRAPH MACHINE OPERATORS
July 1940
INSTRUCTIONS TO EXAMINEES


Important: Failure to follow instructions may 
result in disqualification from the examination. 

Study these instructions carefully. When you are ready to begin 
the examination, signal the Examiner. He will assign you to a machine 
and furnish you with the material with which you are to work. 
Graphotype Machine 

The examination for this machine consists of embossing a number 
of names and addresses in accordance with the form shown in the 
attached sample. 


As soon as you have been assigned to a machine, the Examiner will 
furnish you with a file drawer containing 22 plates and 20 frames. You 
will be given 5 minutes to familiarize yourself with the machine during 
which time you may use 2, and only 2, of the plates to practice 
embossing. 

At the conclusion of the practice period the Examiner will collect 
the 2 practice plates and give you a mimeographed list of names and 
addresses which you are to begin embossing as soon as he gives the signal 
to “Start.” 

Continue embossing until the Examiner calls “Time.” Do not put 
the plates in the frames as they are embossed; you will be required to 
do that later. 

“Time” will be called at the end of exactly 10 minutes. 

The list of names to be embossed has purposely been made longer 
than even the fastest operators are likely to be able to complete. If the 
Examiner calls “Time” while you are in the midst of embossing a plate, 
you may remove the unfinished plate from the machine, but you must 
not continue to emboss it. 


Inserting Plates in Frames 


As soon as the Examiner tells you to do so, place each embossed 
plate in the lower part of a frame and arrange all frames in the file 
drawer so that they will be ready to run through the Addressograph. 

When you have finished this task, the Examiner will provide you
with an envelope containing twenty 4” by 6” cards (on which your
Identification Number has been printed) and assign you to an Addressograph.


Addressograph Machine 
The examination for this machine consists of printing from each 
plate that you have embossed. 
The machine will be set to print consecutively, and you will be 
required to make a single impression on each 4” by 6” card in the 
position shown on the attached sample. 


PLACE YOUR APPOINTMENT SLIP, THE INSPECTION 
SHEET, AND THE TWENTY CARDS INTO THE 
LARGE ENVELOPE AND SEAL IT.


Exhibit H
COMMONWEALTH OF PENNSYLVANIA
EMPLOYMENT BOARD
OF THE
DEPARTMENT OF PUBLIC ASSISTANCE
Harrisburg

PERFORMANCE TEST
GRAPHOTYPE AND ADDRESSOGRAPH MACHINE OPERATORS
Series 1900
July 1940

INSTRUCTIONS TO EXAMINER

 1. Read the INSTRUCTIONS TO EXAMINEES and become entirely familiar
    with their contents before attempting to administer the
    examination.

 2. Examinees will be scheduled at the rate of three or four per hour
    where one set of machines is available, and at the rate of six or
    eight per hour where two sets are available.

 3. Do not admit anyone without an Admittance Slip unless he can
    establish his identity as an examinee who has qualified for the
    machine test.

 4. Provision should be made for the examinee to be comfortably seated
    away from the scene of the examination while he is awaiting his
    turn to operate the machines.

 5. About 10 minutes before the examinee is assigned to a machine,
    take his fingerprint, hand him a copy of the mimeographed
    INSTRUCTIONS, and tell him to read them carefully. If he asks
    any questions, you may answer them, but it should not be necessary
    to furnish any information beyond that already appearing in the
    INSTRUCTIONS.

 6. When the examinee is ready to begin the examination and a
    Graphotype machine becomes available, assign him to the machine
    and furnish him with a file drawer containing 22 plates and 20
    frames (all in perfect condition).

 7. Tell the examinee he may have five minutes to familiarize himself
    with the machine and may practice embossing on two of the plates.

 8. At the expiration of five minutes (or before, if the examinee says
    he is ready to begin) collect the two practice plates and hand him
    the list of names to be embossed. Then say:
        “DO NOT BEGIN UNTIL I GIVE THE SIGNAL.
        EMBOSS EACH NAME AND ADDRESS ON A SEPARATE
        PLATE USING THE SAME FORM AS IN
        THE SAMPLE ATTACHED TO YOUR INSTRUCTIONS.
        COPY THE NAMES AND ADDRESSES
        EXACTLY AS THEY APPEAR, OMITTING ALL
        PUNCTUATION. AT THE END OF 10 MINUTES,
        WHEN I CALL ‘TIME,’ YOU MUST STOP EMBOSSING.”

 9. Say to the examinee: “READY, START.”

10. Permit the examinee to continue exactly 10 minutes. Then say:
        “TIME. STOP EMBOSSING.”

11. If a plate is in the machine, permit him to remove it. Take away
    the list of names and all blank plates. Then say:
        “PLACE EACH EMBOSSED AND PARTIALLY EMBOSSED
        PLATE INTO THE LOWER PART OF A
        FRAME AND ARRANGE THE FRAMES IN THE
        FILE DRAWER SO THAT THEY CAN BE RUN
        THROUGH THE ADDRESSOGRAPH.”

12. When the examinee has placed each plate in a frame, take away
    the unused frames, hand him his envelope containing the cards, and
    assign him to an Addressograph. Then say:
        “PRINT THE CARDS FROM THE PLATES. MAKE
        ONLY ONE IMPRESSION ON A CARD AND IN
        APPROXIMATELY THE SAME POSITION AS
        SHOWN ON THE SAMPLE ATTACHED TO YOUR
        INSTRUCTIONS. THE ADDRESSOGRAPH HAS
        BEEN SET TO PRINT CONSECUTIVELY. GO
        AHEAD.”

13. When the examinee has made an impression from each plate, tell
    him to place the 20 cards (printed and unprinted) into the envelope
    together with his Admittance Slip and Instructions and seal the
    envelope.

    Note: If the examinee is unable to operate the Addressograph sufficiently well
    to print a legible copy from each plate he has embossed, have the plates printed
    on a strip of paper so that a record of his performance on the Graphotype will
    be available for scoring.

14. If at any stage of the machine operation you and the Addressograph
    representative are convinced that the examinee does not
    possess sufficient knowledge of the operation of either machine to
    continue with safety to himself and without damage to the equipment,
    the test may be halted. If this becomes necessary, a full statement
    of the circumstances must be written on the back of the
    Admittance Slip and signed by both you and the representative.
    In any instance in which the plates themselves will be valuable as
    possible exhibits, they should be enclosed in the examinee’s envelope
    before sealing.


Exhibit I
COMMONWEALTH OF PENNSYLVANIA
EMPLOYMENT BOARD
OF THE
DEPARTMENT OF PUBLIC ASSISTANCE
Harrisburg
PERFORMANCE TEST - Series 2100
SENIOR TABULATING MACHINE OPERATOR
December 1940
INSTRUCTIONS TO EXAMINEES


Important: Failure to follow instructions may 
result in disqualification from the examination. 


Study these instructions carefully. As soon as a machine becomes 
available, the Examiner will give you further instructions and furnish 
you with all necessary material. 

The performance test for this position will consist of a two-part 
exercise designed to determine your ability to operate the IBM Hori- 
zontal Counting Sorter and IBM Alphabetic Accounting Machine. The 
time limit for the entire test is 1 hour and 45 minutes. 

You will be given 35 tabulating cards (into which various data 
have been punched) and a plugboard for the Alphabetic Accounting 
Machine. Note that the model of the Accounting Machine being used 
has 32 counters and 55 type bars of which numbers 19 to 43 are 
alphabetic. 

The following operations should be carried out in the order indi- 
cated: 


PART I 


1. Wire the plugboard so that the machine will list the following 
information exactly as shown on the attached sample: 
Card Number 
Name (last, first initial, middle initial)
Social Security Number 
Total Benefits Paid 
Weekly Benefit Amount 
Reason 
Note: In addition to listing the data, the machine is to be 
wired to show totals at the end of each of the following fields: 
Total Benefits Paid (allow for six digits) 
Weekly Benefit Amount (allow for six digits) 
2. Sort the cards in “Card Number” order. 
3. Write your Identification Number in the space provided on 
Form I and list the data from the cards to conform to the 
sample and the above instructions.


PART II


1. Wire the plugboard to control on “Reason.”
2. Sort the cards by “Reason,” disregarding “Card Number.”
3. Set automatic hammerlock control to eliminate the listing of
names.
4. On another copy of Form I write your Identification Number
in the space provided and list the data from the cards, single-spaced,
to show subtotals for each reason (for Total Benefits
Paid and Weekly Benefit Amount).


PLACE YOUR ADMITTANCE SLIP, THIS INSTRUCTION 
SHEET, THE PUNCHED CARDS, AND BOTH COPIES 
OF FORM I IN THE MANILA ENVELOPE AND SEAL IT. 
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Exhibit L 
PERFORMANCE TEST—TABULATING MACHINE OPERATORS 
Series 2100 

















B-24356 Allegheny 
Ident. Number Legal County File Number 
Withdrew from Examination (0).........600 000 cece v 
PART I 
Devdiey Be fal Prater (5)... 5. occ c evn cceeeces 5 
Location of Fields (15) 
EI oa 2s sos ch wisn Dawe wa ewes 1 
a ae Bal ins a ores Wea eS 2 
Sociel Security Number (2)... ... 2.2... .ccccewcces 2 
SN nn eae ee 0 
Weekly Benefit Amount (2)................44. vee 2 
SE Eo eee rr errs Ce 2 
Totals 
Mecmiets Paid Column (2)... 0.0.0 c sec ccces 2 
Benefit Amount Column (2)................-. 0 
Accuracy of Totals (10) 
NS 6) ee ne 5 
Benefit Amount Column (5)...............0.2000 5 
PART II 
EES SELES SO CLE REE 5 
Location of Fields (15) 
NG is oa aa eh AE ae 1 
al a ula 6d 4 Sr ny ale oe 2 
Seen eee Grammer (2)... 0.5 cos ccc deecccects 0 
nw ok Wow ddd ba kuewewen 0 
Weakly Beneltt Amount (2)... . 2... ccecsccceses 2 
etree Sh Oca vg pale a ah Ae WR eee ke 2 
Subtotals 
ON ee D&S re er 2 
Benefit Amount Column (2).................4. 2 
Accuracy of Subtotals (10) 
a 5 
eS) ere es 
Time (40) 


45 minutes or less (40) 

46 to 60 minutes (30) __48_min. 

fe OS a, See ee 30 
76 to 90 minutes (10) 

91 to 105 minutes (0) 


TOTAL RAW SCORE | 8 | 
Scored by S.W.K. Checked by P.N.E. 








(Form EB-742) 
Exhibit M


COMMONWEALTH OF PENNSYLVANIA 
EMPLOYMENT BOARD 

OF THE 

DEPARTMENT OF PUBLIC ASSISTANCE 
Harrisburg 

PERFORMANCE TEST 
DUPLICATING MACHINE OPERATORS
Series 2800 
November 1940 


INSTRUCTIONS TO EXAMINEES 


Important: Failure to follow instructions may 
result in disqualification from the examination. 


Study these instructions carefully. As soon as a machine becomes 
available, the Examiner will give you further instructions and furnish 
you with all necessary material. 

Mimeograph Machine. Time limit, 30 minutes. The Examiner
will furnish you with the following material:

1 newly cut mimeograph stencil
75 4” by 6” cards
75 sheets of pre-printed 8½” by 11” mimeograph paper
(sample attached)

The examination for this machine will consist of (1) putting the
stencil on the cylinder, (2) adjusting the machine and duplicating 25
cards so that the material which has been typed on the stencil is
centered on each card, (3) readjusting the machine and duplicating 25
sheets so that the typed material is centered within the pre-printed box,
and (4) removing the stencil and preparing it to be filed for future
use. (You may use as many cards and sheets of paper as necessary in
setting up the machine, but do not waste any. All material will be
considered in determining your score in the examination.)

WHEN YOU HAVE COMPLETED THIS PORTION 
OF THE TEST, PLACE THE STENCIL AND ALL 
USED CARDS AND PAPER INTO THE LARGE 
MANILA ENVELOPE. 

Multilith Machine. Time limit, 30 minutes. The Examiner will
furnish you with the following material:

1 photographic multex plate
75 sheets of pre-printed 8½” by 11” bond paper (sample
attached)
Supply of Platex, Keepeze, Blankrola, Repelex, and absorbent
cotton

The examination for this machine will consist of (1) applying
Platex, (2) putting the plate on the machine, (3) adjusting the machine
and duplicating 25 sheets so that the typed material is centered within
the pre-printed box, (4) removing the plate and preparing it to be filed
for future use, and (5) cleaning the blanket. (You may use as many
cards and sheets of paper as necessary in setting up the machine, but do
not waste any. All material will be considered in determining your score
in the examination.)


WHEN YOU HAVE COMPLETED THIS PORTION 
OF THE TEST, PLACE YOUR ADMITTANCE SLIP, 
THIS INSTRUCTION SHEET, AND ALL USED 
PAPER INTO THE LARGE MANILA ENVELOPE 
AND SEAL IT. 
Exhibit N

COMMONWEALTH OF PENNSYLVANIA
EMPLOYMENT BOARD
OF THE
DEPARTMENT OF PUBLIC ASSISTANCE
Harrisburg

PERFORMANCE TEST FOR DUPLICATING MACHINE OPERATORS
Series 2800
November 1940

EXAMINER’S RATING SHEET

Mimeograph Machine

Examinee’s Id. No. ____________

OPERATION                          Card    Paper    Remarks
Place stencil on cylinder           X
Adjust paper feed
Make side margin adjustment
Make vertical margin adjustment
Use of print recorder
Run copies
Take off stencil
Clean stencil

TIME START: ________   STOP: ________   ELAPSED: ________

Multilith Machine

OPERATION                          Paper    Remarks
Platex plate
Put on plate
Ink up
Pull proof
Locate form in proper position
Clean blanket
Set counter
Run copies
Clean plate
Take off plate
Keepeze
Clean blanket

TIME START: ________   STOP: ________   ELAPSED: ________

Date ____________

Examiner ____________

Examiner ____________
Exhibit O
PROCEDURE FOR SCORING DUPLICATING MACHINE 
OPERATOR PERFORMANCE TEST 
SERIES 2800 
SCHEDULE OF CREDITS 


Note: The examinee’s rating for each item is to be 
determined in accordance with the following schedule 
and entered in the appropriate space on the scoring 
sheet (Form EB-760). All ratings and totals must 
be checked by a second scorer. 


Mimeograph (40) 
Time (16) 
1 to 10 minutes — 16 points 
11 to 15 minutes — 12 points 
16 to 20 minutes — 8 points 
21 to 25 minutes — 4 points 
26 to 30 minutes — no credit 
Stencil (2) 
Removed and cleaned properly (credit if checked on rating sheet) 
—2 points 
4” by 6” cards (11) 
Practice cards (not more than 15) — 2 points 
Number of copies (25 to 30) — 2 points 
Use of counter — 1 point 
Vertical margin adjustment (80% of final copies with at least ¾”
margin top and bottom) — 2 points 
Side margins (80% of final copies with at least ¾” margin each
side) — 4 points 
8½” by 11” paper (11)
Practice sheets (not more than 15) — 2 points 
Number of sheets (25 to 30) — 2 points 
Use of counter — 1 point 
Vertical margins (80% of final copies not touching horizontal 
lines) — 2 points 
Side margins (80% of final copies not touching vertical lines) — 4 
points 
Penalty (-4) 
No. 1—If examinee was given information on side margin adjust- 
ment—subtract 4 points (but only when examinee has received 
credit for correct side margin adjustment). 


Multilith (60) 
Time (24)
1 to 10 minutes — 24 points 
11 to 15 minutes — 18 points 
16 to 20 minutes — 12 points 
21 to 25 minutes — 6 points 
26 to 30 minutes — no credit 
Clean plate (credit if checked on rating sheet) — 5 points 
Clean blanket (credit only if second listing is checked on rating sheet) 
— 2 points 
8½” by 11” paper (29)
Practice sheets (not more than 15) — 5 points 
Number of copies (25 to 28) — 3 points 
Use of counter — 2 points 
Side margins (at least 3/16” on each side) — 6 points 
Vertical margins (at least 3/16” top and bottom) — 10 points 
Inking (3) 
Evenness — 2 points 
Blackness — 1 point 

Penalties (-10) 

No. 2—If examinee was given information on use of ink rolls — 

subtract 5 points.

No. 3—If examinee was given information on side margin adjust- 
ment — subtract 3 points (but only when examinee has re- 
ceived credit for correct side margin adjustment). 

No. 4—If examinee left an ink roll in contact with plate cylinder— 

subtract 2 points. 
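
The time credits in the schedule above follow a simple banded lookup, which can be sketched as below. This is only an illustration: the function and variable names are hypothetical, and it assumes elapsed time is recorded in whole minutes, as the bands in the schedule suggest.

```python
# Upper limit (in whole minutes) of each time band in the schedule.
BAND_LIMITS = [10, 15, 20, 25, 30]

# Credits for each band, copied from the schedule of credits.
TIME_CREDITS = {
    "mimeograph": [16, 12, 8, 4, 0],
    "multilith": [24, 18, 12, 6, 0],
}


def time_credit(machine: str, elapsed_minutes: int) -> int:
    """Return the time credit for one machine, given elapsed whole minutes."""
    if not 1 <= elapsed_minutes <= BAND_LIMITS[-1]:
        raise ValueError("elapsed time must be between 1 and 30 minutes")
    # Walk the bands in order and return the credit of the first one that fits.
    for limit, credit in zip(BAND_LIMITS, TIME_CREDITS[machine]):
        if elapsed_minutes <= limit:
            return credit
    raise AssertionError("unreachable: every valid time falls in a band")


print(time_credit("mimeograph", 12))
print(time_credit("multilith", 26))
```

Keeping the band limits and credits in tables rather than in a chain of conditions makes it easy for a second scorer to compare the lookup with the printed schedule.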












EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT 

































Exhibit P 


PERFORMANCE TEST—DUPLICATING MACHINE 
OPERATORS—SERIES 2800 








File Number County Ident. Number 














Operation of Mimeograph (40) 
Withdrew or was stopped by Examiners (F)........ ——— 
OND NID Se econ eithe on henG aia oN a aioe oa w oeweawnas Cee aoe 
MEO oon es Sis wks SUR inde s Sah ames 


4” by 6” Cards (11) 
i dc de bb oe oka OR oe 
SN IN ad ss wncinte eu sad ans on 9), eeeneeeell 
Vertical margin adjustment (2).............. 
EE Oe nee eee ee ihisenbiaiiiaaal 
8½” by 11” Paper (11)
SS > Ee a en ae a 
I I ED cick doko s kekaee siemabamenlll 
Vertical margin adjustment (2).............. 
ok 5 6 Anh in KAW KN AO 


Operation of Multilith (60) 
Withdrew or was stopped by Examiners (F)........ | iabsitiiassiaan 
I eae atta uk Wek d ou Le ow we lew ae 
THE VALUE OF INTELLIGENCE QUOTIENTS
OBTAINED IN SECONDARY SCHOOL
FOR PREDICTING COLLEGE SCHOLARSHIP


L. D. HARTSON
and
A. J. SPROW
Oberlin College


IN SELECTIVE admission to college, and particularly in
the award of scholarships, it is the practice to request a
report of scores made by the candidates in intelligence tests.
This study reports (1) the relative value of these different
tests for predicting (a) total high school scholarship,¹ (b)
college freshman scholarship, and (c) seven-semester college
scholarship; (2) the comparative validity of these tests and
the Ohio State University Psychological Examination; (3) the
average I.Q. of the student body, as determined by these
various instruments; (4) comparison of the I.Q.’s of the
Oberlin group with those of the Terman-Merrill standardization
group; (5) average freshman scholarship for students of
the different I.Q. levels.

A total of 835 freshmen entered the College of Arts and
Sciences, Oberlin College, during the period, 1934 to 1940,
for whom I.Q.’s were available, which could be identified with
specific forms of tests, in groups large enough to warrant
statistical treatment. In six cases there were two scores, making
the total of 841 in Table 1. Of these, 253 had progressed as
far as their eighth semester. For these, the computations are
based upon the scholarship record for seven semesters (those
used for the award of Phi Beta Kappa). In order to make
the scholastic records equivalent for the several classes of
varying size, scholarship is handled in terms of proportional
class rank. Because the reports did not, in all cases, specify
the particular variety of Otis test employed, all the Otis scores
have been grouped. All the I.Q.’s here considered were derived
from group tests.

¹ The figures used in the computations of high school scholarship represent,
not the actual grades, but “credit points” obtained by a system used to equate
different grading schemes.

Results 

Table 1 reports the coefficients of correlations (1) between 
the I.Q.’s on the Otis, Terman, Henmon-Nelson, National, 
and Kuhlmann-Anderson Tests, and scores on the Ohio State 
University Psychological Examination administered before 
matriculation in college as one variable, and high school 
scholarship; (2) between the above-named tests and first 
semester college scholarship; (3) between I.Q.’s on one or 
another of the first five tests and scores on the OSU test, 
administered during freshman week, with the means and 
sigmas. In the case of the National and of the Kuhlmann- 
Anderson tests the N is rather small, and all the data, there- 
fore, are less reliable than those obtained with the other tests. 
To obtain a basis for comparing the validities of the OSU test
and each of the others, Table 2 reports, for each of the test
groups, the correlation between scores in the OSU test and
(a) high school scholarship and (b) first semester college
scholarship, with means and sigmas.

TABLE 2
CORRELATIONS BETWEEN THE OSU TEST SCORES AND SCHOLARSHIP OF
THE GROUPS TESTED WITH THE OTIS, TERMAN, HENMON-NELSON,
NATIONAL, KUHLMANN-ANDERSON AND OSU TEST (PRE-ENTRANCE
GROUP) WITH MEANS AND SIGMAS

                                    Scholarship
Test Group                  N    High Sch.  Freshman    Means   Sigmas
Otis                       444     .394       .579      49.35    28.95
Terman                     221     .337       .550      51.47    28.00
Henmon-Nelson              110     .458       .604      48.59    29.22
National                   ...     .473       .631      54.18    25.36
Kuhlmann-Anderson           28     .633       .564      51.93    26.63
OSU Test (pre-entrance)    258     .510       .629      48.72    28.72

1. Relative Validities of the I.Q. Tests. A comparison 
of the validities of the different tests yielding I.Q.’s indicates 
that the Henmon-Nelson test gets first place. However, the 
relative deviate of the difference (k) between the correlation 
of Henmon-Nelson I.Q.’s and college scholarship (.480) and
the corresponding coefficient for the Otis test (.364) is 
but 1.20. 
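
A comparison of this kind can be sketched with Fisher's z transformation, the usual modern way of testing the difference between two independent correlations. The article does not state which formula or which group sizes produced the value 1.20, so the sketch below is an assumption on both counts: it borrows the group sizes from Table 2 (110 for Henmon-Nelson, 444 for Otis), and the function name is hypothetical. It should therefore not be expected to reproduce 1.20 exactly.

```python
import math


def fisher_z_ratio(r1: float, n1: int, r2: float, n2: int) -> float:
    """Ratio of the difference between two independent correlations
    to its standard error, using Fisher's z transformation."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    # Standard error of the difference between two independent z values.
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / se


# Henmon-Nelson (.480) against Otis (.364); group sizes assumed from Table 2.
ratio = fisher_z_ratio(0.480, 110, 0.364, 444)
print(round(ratio, 2))
```

Under these assumptions the ratio comes out at roughly 1.3, well short of the value of about 2 conventionally required, which agrees with the conclusion that the difference is not significant.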

2. The Prediction of High School and College Scholar- 
ship. The coefficients indicating the relationship between I.Q. 
and high school scholarship range between .212 and .396. 
With the exception of the Kuhlmann-Anderson test (for 
which there are but 28 cases) the I.Q.’s constitute a better 
basis for predicting college scholarship than they do high 
school grades. When validated against college scholarship, 
the coefficients range between .287 (omitting Kuhlmann- 
Anderson) and .480. Byrns and Henmon, who used I.Q.’s 
obtained from National Intelligence Tests, administered in the 
4th to 8th grades, also found a closer correlation between 
I.Q. and first semester college scholarship (.454) than be- 
tween I.Q. and total high school scholarship (.426) (2). 
Higher validities for the college criterion were also obtained 
for both of the OSU examinations. With the pre-entrance 
OSU Test the coefficients are .365 and .474, and with the 
Freshman Week test, they are .510 and .629, for high school 
grades and college scholarship, respectively. These results 
substantiate previous findings at Oberlin. For the 511 men 
and 609 women who entered as freshmen during the period, 
1931 to 1934, the correlation between college scholarship and 
OSU Test intelligence is represented by coefficients of .605 and 
.574, for the men and the women, respectively, whereas the 
correlation between test intelligence and high school scholar- 
ship is represented by coefficients of .398 and .380 (3). It 
will be noted that the OSU Test scores have higher validity
than does the I.Q., as indicated by the coefficients obtained
when correlations were computed for each of the I.Q. populations
between the scores made with the OSU Test and the two
scholarship criteria (Table 3).

TABLE 3
COMPARATIVE VALIDITIES OF THE OSU TEST AND THE I.Q. TESTS FOR
PREDICTING HIGH SCHOOL AND COLLEGE SCHOLARSHIP

                      High School Scholarship    College Scholarship
Test Group              OSU Test      I.Q.        OSU Test      I.Q.
Otis                      .394        ...           .579        .364
Terman                    .337        .281          .550        .403
Henmon-Nelson             .458        .396          .604        .480
National                  .473        .212          .631        .287
Kuhlmann-Anderson         .633        .247          .564        .178

The OSU Test is designedly a more difficult one than the 
other tests. Although some tests have as many items as the 
OSU Test, none requires as much time. All of the I.Q. tests 
are time limited, maximum time being 30 minutes, whereas 
the OSU Test was administered by work-limit method, stu- 
dents usually taking at least two hours. 


3. Comparative Validity of Pre-entrance and Freshman 
Week OSU Test Scores. That the higher coefficients obtained 
for the OSU Test may not be due entirely to its greater 
difficulty, however, is suggested by a comparison of the co- 
efficients obtained for the OSU Test under two sets of condi- 
tions. There were 258 students who took the OSU Test some 
time before entering college who were re-examined with this 
test during their Freshman Week. (In some instances the 
same form of the test was used, but usually it was another 
form.) The Freshman Week test yielded substantially higher 
validity figures with both criteria: .510 as compared with .365 
for high school scholarship, and .629 as compared with .474 
for college freshman scholarship. These coefficients, with the 
means and sigmas, are reported in Tables 1 and 2. It will 
be noted that the group given the OSU (pre-entrance) Test 
had distinctly lower scholastic records than the others and 
that they also displayed greater variability. This is to be 
explained by the fact that the great bulk of these students
were given the test because their high school record made their 
admission questionable. On the other hand, the group con- 
tained some who, being of exceptionally high caliber, were 
applying for scholarships. In most cases, it may be presumed, 
the OSU Test was administered under conditions of greater 
motivation than prevailed with the I.Q. tests. 

4. Intercorrelations Between the Test Scores. The intercorrelations
between the scores made on the OSU Test and the
different forms of I.Q. range from .456 to .610; the sequential 
order from the higher to lower coefficients being: Terman, 
Henmon-Nelson, Otis, National, and Kuhlmann-Anderson. 


5. Validation of Tests Against Total College Scholarship. 
There are 253 students for whom scholastic grades are avail- 
able for seven semesters of the college course. Because the 
numbers were too small to warrant separate computations for 
each of the tests, all of the I.Q.’s were combined and the 
validity coefficients computed, using both the one- and the 
seven-semester criterion. Validity coefficients were obtained 
for the OSU Test for the same population. These are given 
in Table 4. The two validity figures for the I.Q.’s are .341 
and .319, and for the OSU Test scores the figures are .501 
and .438. Although scores on the OSU Test are more valid 
bases for prognosing college grades than is the I.Q. at both 
levels, their superiority for predicting total college scholarship 
is less than when used for predicting freshman grades. From 
Table 4, one may also note that, as in the computations reported
for the other populations, the I.Q. (and OSU Test
score) shows a closer relationship for college freshman
scholarship than for high school scholarship.

TABLE 4
CORRELATIONS BETWEEN I.Q.’S AND OSU TEST SCORES AND (1) HIGH
SCHOOL SCHOLARSHIP, AND SCHOLARSHIP FOR (2) ONE AND (3) SEVEN
SEMESTERS; WITH MEANS AND SIGMAS

Test Score      N    High Sch.   1 Sem.   7 Sems.     Mean    Sigma
I.Q.           253     .171       .341     .319     121.24    10.00
OSU Test       253     .307       .501     .438      50.95    29.85
Mean                  75.82      53.37    47.16
Sigma                 17.19      25.68    29.00

6. Average I.Q. of the Oberlin Student Body. The mean
I.Q. of the 835 freshmen, as measured by these group tests, 
is 121.06. There is substantial agreement on this point be- 
tween the Otis, Terman, and Henmon-Nelson tests (see 
Table 1). The cases measured by the National and Kuhlmann- 
Anderson tests, for which the means are 127.21 and 124.96 
respectively, are too few to influence the general average 
materially. The average for the group of 253 who attained 
senior status is 121.24, thus indicating that practically no 
selection occurred between the freshman and the senior year 
in terms of I.Q. This is corroborated by the OSU Test stand- 
ing of the freshman and senior groups. In terms of local 
freshman norms, the mean score of the 835 students is 49.97,
thus indicating that they are an almost perfectly representative
sample of the Oberlin first-year population. The mean score
of the 253 who became seniors is 50.95. The seniors do, how- 
ever, constitute a somewhat selected group in terms of college 
scholarship. This is indicated by the fact that, whereas the 
mean freshman scholarship of the entire group is represented 
by a proportional rank of 49.55, the mean freshman rating of 
those who persisted until they reached senior status is 53.37
(the larger figure represents higher scholarship status), and
the mean scholarship of the 588 who had not become seniors 
is 47.91. The critical ratio of the difference between the fresh- 
man scholarship of those who became seniors and those who 
did not is 2.73. 
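
The three means quoted in this paragraph are arithmetically consistent with one another, as a short check shows. The sketch below simply recombines the two subgroup means, weighted by the group sizes given in the text (253 and 588, which together make the 841 scores of Table 1); the variable names are illustrative only.

```python
# Mean freshman proportional rank of each subgroup, as quoted in the text.
seniors_n, seniors_mean = 253, 53.37
others_n, others_mean = 588, 47.91

# Weighted mean of the two subgroups should recover the mean of the whole group.
combined_mean = (seniors_n * seniors_mean + others_n * others_mean) / (
    seniors_n + others_n
)

# Superiority of those who became seniors over those who did not.
difference = seniors_mean - others_mean

print(round(combined_mean, 2), round(difference, 2))
```

The weighted mean agrees with the figure of 49.55 reported for the entire group, and the difference agrees with the 5.46 points cited in the Summary.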

7. Comparison of Oberlin Students with the Terman-Merrill
Standardization Group. Figure 1 presents a graphic
comparison of the Oberlin students with the normal group of 
2904 used in the standardization of the Terman-Merrill Binet 
test (4, p. 37). The numbers and proportions of the Oberlin 
population at the different I.Q. levels are reported in Table 5. 
The I.Q.’s of the Oberlin group range from 92 to 169, 99 per
cent exceeding the Terman-Merrill mean. The Oberlin sample
is rather sharply peaked, showing little kurtosis, but it displays
a slight positive skewness. Variability is much less than that
of the Terman-Merrill sample, sigma being 10.4 as compared
with 16.4 for the larger group.

TABLE 5
FRESHMAN AND SENIOR SCHOLARSHIP OF STUDENTS OF
DIFFERENT I.Q. LEVELS

                   Entrants                          Seniors
                         Scholarship     N pos-  N ac-          Scholarship
 I.Q.       N     %     Mean   Range     sible   tual     %     Mean   Range
166-170     3    0.4    97.7   97-99        3      2     66.7   96.0   93-99
161-165     0                               0
156-160     0                               0
151-155     1    0.1    94.0   94           1      1    100.0   87.0   87
146-150    13    1.6    72.9   13-95        5      3     60.0   63.3   30-90
141-145    14    1.7    72.1   26-97        7      5     71.4   66.0   12-84
136-140    44    5.3    70.9   5-98        16     13     81.3   81.2   1-99
131-135    57    6.8    59.8   4-99        18     14     77.8   49.6   13-99
126-130   110   13.2    58.4   1-99        43     36     83.7   57.4   1-99
121-125   178   21.3    52.1   1-99        74     57     77.0   53.3   5-100
116-120   179   21.4    45.5   1-98        71     52     73.2   42.4   2-91
111-115   119   14.2    41.3   1-97        48     32     66.7   38.5   2-95
106-110    75    9.0    33.3   1-94        39     26     66.7   31.1   5-65
101-105    29    3.5    36.7   2-98        18      8     44.4   37.3   4-69
 96-100     8    0.9    25.1   3-42         2      2    100.0   27.0   26-28
 91-95      5    0.6    25.8   7-69         2      2    100.0   24.5   14-35
Total     835                             347    253     72.9

[Figure 1. Distributions of the I.Q.’s in the Terman-Merrill Standardization
Group (N = 2904) and the Oberlin Group (N = 835), plotted over I.Q.
intervals from 35-44 to 165-174.]

8. Mean Scholarship of Students of Different I.Q. Levels.
Table 5 reports the numbers and proportions of students of
the different levels of I.Q. with their (1) mean freshman
scholarship rank, (2) mean senior scholarship rank, (3) the
range of scholarship achievement for those at each level, and
(4) the proportion of those in college long enough to have
attained senior status who did so, for each I.Q. level.
Examination of the table reveals the following salient facts:

(a) As indicated by the correlation coefficients previously 
noted, the general tendency is for those of higher I.Q. to 
make the better scholastic records. 

However, (b) the range of scholastic performance is, 
with few exceptions, remarkably similar at each test level. 
Freshman achievement in the highest and the lowest deciles 
is recorded for students with I.Q.’s ranging all the way from 
105 to 140, although no student with an I.Q. below 111 
achieved a top tenth ranking for the entire college course. 
There was one student with an I.Q. of 105 who achieved a 
proportional rank of 98 in freshman scholarship and has been 
in the upper tenth of her class in each of the two subsequent 
years. Her centile score, according to state norms, on the 
OSU Test is, however, 71, so the later test is evidently a more 
accurate index of her intellectual ability. 

(c) The four students with I.Q.’s above 150 all made 
exceptionally good records. 

(d) Sufficient time has elapsed to permit but four stu- 
dents whose I.Q.’s are below 101 to become seniors. They 
have all obtained the A.B. degree, but in only one instance was 
this achieved in the normal four-year period. By persistent 
effort, however, they did finish the course, and all of them 
ranked above the lowest decile of their class. This is com- 
parable with Adams’ finding at the University of Texas. 
(e) The retentive power of the college was not materially
greater for those at the higher than for those at the
lower extremes of the distribution.

(f) Some degree of selectivity is indicated, however, by
a comparison of those with I.Q.’s above 126 with those whose
I.Q.’s are below 116. Of the 93 in the group with the higher
I.Q.’s, 74, or 79.57 per cent, persisted, whereas but 70, or
64.22 per cent, of the 109 with the lower I.Q.’s persisted to
the senior year. As the ratio of the difference in these proportions
to the standard error of the difference is 2.47, it is
fairly significant.
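
The ratio cited in item (f) can be checked from the counts given there. The sketch below assumes the standard error of the difference is formed from the two sample proportions separately, without pooling; the article does not name its formula, but this assumption reproduces the reported figure. The function name is hypothetical.

```python
import math


def proportion_critical_ratio(k1: int, n1: int, k2: int, n2: int) -> float:
    """Difference between two independent proportions divided by the
    (unpooled) standard error of that difference."""
    p1, p2 = k1 / n1, k2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se


# 74 of 93 higher-I.Q. students persisted, against 70 of 109 lower-I.Q. students.
ratio = proportion_critical_ratio(74, 93, 70, 109)
print(round(100 * 74 / 93, 2), round(100 * 70 / 109, 2), round(ratio, 2))
```

Run on the counts above, the sketch returns the two percentages quoted in the text and a ratio of 2.47.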

Summary 

1. I.Q.’s were available for 835 entering freshmen and 
for 253 of these who had reached the senior year, the scores 
having been derived from the following group tests: Otis, 
Terman, Henmon-Nelson, National and Kuhlmann-Ander- 
son. Scores on the OSU Psychological Examination were also 
obtained. 

2. The difference in the power of the different I.Q. tests 
to predict college scholarship was not statistically significant. 

3. The I.Q.’s constitute a better basis for predicting col- 
lege grades than they do for prognosing total high school 
scholarship. This is also true of the OSU Test scores. 

4. The OSU Test is more successful than any one of the 
other tests in predicting scholarship in high school as well as 
in college. 

5. The OSU Test taken during Freshman Week corre- 
lates more closely with both secondary and college scholar- 
ship than does the same test taken during the senior year 
in high school. 

6. The OSU Test predicts freshman scholarship better 
than it does total college scholarship. 

7. The average I.Q. of the freshmen is 121. The aver- 
age obtained by the Otis, Terman, and Henmon-Nelson tests 
is virtually the same. The averages for the small number 
tested with the National and Kuhlmann-Anderson tests are 
127 and 125, respectively. 
8. The average I.Q. for the seniors is also 121. As the
mean OSU Test score for the seniors is but one percentile 
point higher than that for the freshmen, it is evident that 
virtually no selection occurs during the college course, so far 
as test intelligence is concerned. To be sure, there is some 
selection in terms of scholastic record, there being a supe- 
riority of 5.46 points in the freshman scholastic rating of 
those who persisted over those who did not become seniors. 
The critical ratio of this difference is 2.73. 

9. The I.Q.’s of the Oberlin freshmen range from 92 to
169; 99 per cent of the I.Q.’s are over 100. Variability
is represented by a sigma of 10.4, as compared with 16.4
for the Terman-Merrill standardization group.

10. Although the correlation between I.Q. and college 
scholarship is .40, the range of scholastic performance is re- 
markably similar at the different test levels between 101 
and 140. 

11. Four students with I.Q.’s between 91 and 100 became 
seniors, but their records were not brilliant. 

12. Although the retentivity of the college was not materially
greater for those at the extremely high end of the
distribution than for those at the lower end, 80 per cent of 
those with I.Q.’s above 126, as compared with 64 per cent of 
those with I.Q.’s below 116, who had been in college long 
enough, became seniors. 


Conclusions 

Two facts of general significance emerge from the compu- 
tations: First, the figures indicate that, although it is to be 
expected that students with higher intelligence test scores will 
make the better college records, it is nevertheless possible for 
the average of the group with I.Q.’s as low as 101-105 to do 
acceptable work at Oberlin. There are indeed exceptional 
students who, in spite of the handicap of an intelligence quo- 
tient as low as 92, obtain the A.B. degree. Second, test scores 
show a consistently closer correlation with college scholarship 
than with high school records. Interpretation of these facts
would seem to point to the significance of adequate motivation.
Possessed of determination, drive, and directionality,
the student whose intellectual ability barely equals the average
of the general population can “make the grade”, if equipped
with a good secondary school preparation. Selection of the
student body at Oberlin is made primarily on the basis of the
high school record. Students with low I.Q.’s, who rank in
the lower half of their high school class, are not admitted.
The higher validity figures obtained when college scholarship
is used as the criterion also emphasize the factor of motivation.
Oberlin students, at any rate, apparently work more
nearly up to their potential capacity, so far as this is measured
by the intelligence tests, while in college than in secondary
school.
THE THURSTONE PRIMARY MENTAL ABILITIES
TESTS AND COLLEGE MARKS


MARY LOU ELLISON
and
HAROLD A. EDGERTON
Ohio State University

THE PRESENT STUDY of Thurstone’s Primary Mental
Abilities Tests has been made in order to implement
the assumption that the scores of the several factors might
be useful in academic counseling. Four questions form the
basis for the investigation.


1. What relationships are there between the factor scores 
and academic grades? 


2. What relationships are there between the Ohio State 
University Psychological Test score and the factor scores? 


3. How well can academic grades be predicted on the 
basis of the primary factor scores? 


4. Are the factor scores related to grades in specific col- 
lege subjects? 


Thurstone’s development of his Primary Mental Abilities
Tests was for the purpose of appraising seven primary factors
of mind.¹ His isolation of these factors and the development
of the final test battery is described in the monograph
“Primary Mental Abilities.”² Thurstone briefly describes
the factors on his individual record sheet for the tests as
follows:

¹ L. L. Thurstone, Manual of Instructions for Administering Tests for Primary
Mental Abilities, p. 2.

² L. L. Thurstone, “Primary Mental Abilities,” Psychometric Monographs.
Chicago: The University of Chicago Press, 1 (1938).

“Factor P. The tests that call for this ability require the quick
perception of detail in either visual or verbal material. This seems
to be a perceptual ability which enables some people to excel in
finding detail which is significant to them or detail which they are
seeking. It is probably one of the factors that is involved in what
has been called ‘quick intelligence.’ Scanning a page to find quickly
some small but significant detail and classifying familiar objects
quickly are examples of this factor.

“Factor N. This is one of the clearest factors that has been 
isolated. It consists of facility with simple numerical work and is 
best represented in the tests of rapid calculation. It is of secondary
importance in arithmetical reasoning and in deciphering numerical 
code, tasks which call for factors in addition to facility with num- 
bers as such. It is not yet known whether this factor can be exem- 
plified in non-numerical tasks. 

“Factor V. This is a verbal factor which is manifested in tests 
that involve the interpretation of language. It is not restricted to 
mere fluency with words. It reflects an ability to deal readily and 
quickly with verbal material. Those who excel in this factor are 
probably verbally-minded in their thinking and problem-solving. 

“Factor S. This is an ability that is present in those tests which
require the subject to think visually of geometrical forms and of 
objects in space. While none of these factors can be described in 
detail yet, it seems reasonable to expect that those who have a high 
rating on ability S should be able to do well in those studies and in 
those occupations that require visualizing or thinking about things 
in visual form. Many people think about a problem visually even 
when the nature of the problem does not immediately suggest any 
necessary visual character. 

“Factor M. The nature of this factor was identified by the 
fact that all of the tests which require it are tests of memorizing. 
The appearance of such a factor seems to give justification for the 
belief that a good memory is an ability independent of other mental 
powers. It is not yet known, however, whether the ability to 
memorize is the same as the ability to recall experiences which we 
do not intend to retain for future recall. The present factor M can 
be tentatively named the ability to memorize. 

“Factor I. The tests which require this factor demand that
the subject discover some rule or principle in the material of the
test. The factor does not seem to be restricted to material which is
primarily numerical, primarily visual, or primarily verbal, types
which were all represented in the tests for this factor. The ability
to discover a rule or principle in the solution of a problem is usually
called induction. People differ markedly in the kind of resourcefulness
that is involved in inductive thinking, and the hypothesis that
the factor I is associated with this kind of ability seems plausible.
It is not known whether this factor is associated with inventiveness
and initiative.

“Factor D. The deductive factor is still only tentatively iden- 
tified. It is a factor which is present in syllogistic reasoning and 
also in some other tests. It is one of several factors that may be 
involved in restrictive thinking. In a general description, the factor 
seems to represent facility in formal reasoning.” 

In the present study, Thurstone’s Primary Mental Abili- 
ties Tests, Experimental Edition were used. 

The subjects consisted of a group of 49 students in the 
College of Arts and Sciences, Ohio State University. Most 
of those who took the test were students in the Exploratory 
Program of the College of Arts and Sciences. 

The students tested do not constitute a random sample 
of students of the Exploratory Program, nor of the College 
of Arts and Sciences, nor of freshmen generally. This fact 
must be taken into consideration in the interpretation of the 
results of the study. No one was required to take the test. 
Of the forty-nine subjects, forty-one were freshmen, six were 
sophomores, and two were juniors. In the group, 39 per cent 
ranked in the 90th percentile or above in intelligence (Ohio 
State University Psychological Test), and 54 per cent were 
included in the 80th percentile or above. The mean Point 
Hour Ratio* was 2.40. 
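
The Point Hour Ratio, as defined in the accompanying footnote, is a weighted average of grade points per hour attempted. A minimal sketch in Python follows; the function name, the input format, and the example transcript are assumptions made for illustration and are not part of the study.

```python
# Grade points per credit hour, as stated in the article's footnote.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "E": 0}


def point_hour_ratio(courses):
    """Return total grade points divided by total hours attempted.

    `courses` is a list of (grade, hours) pairs; this input format is an
    assumption made for the sketch.
    """
    total_hours = sum(hours for _, hours in courses)
    if total_hours == 0:
        raise ValueError("no hours attempted")
    total_points = sum(GRADE_POINTS[grade] * hours for grade, hours in courses)
    return total_points / total_hours


# Hypothetical transcript: 5 hours each of A, B, and C.
example = [("A", 5), ("B", 5), ("C", 5)]
print(point_hour_ratio(example))  # 3.0
```

On this scale the group mean of 2.40 lies between a C average (2.0) and a B average (3.0).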

In addition to the scores for the seven factors, and the
separate scores on the sixteen individual tests from which
the factor scores are derived, other data from the college records
were used. Intelligence test percentiles were based on
scores received in the Ohio State University Psychological
Examination, given to all students at the time of entrance
to the University. The Point Hour Ratio for each student
was obtained. Grades received in English, sciences, foreign
languages, and psychology were recorded, since it was thought
that each of these groups might be related differentially to
the factor scores. In the group of forty-nine students, English
grades were available for twenty-seven, science grades for
thirty, foreign language grades for twenty-seven, and psychology
grades for twenty-five.

*The Point Hour Ratio is the total points divided by the hours attempted.
For each hour of grade A, four points are given; for each hour of B, three
points; C, two points; D, one point; and E (failure), zero points.

1. What relationships are there between the factor scores 
and Point Hour Ratio? 

In Table 1, the correlations between Point Hour Ratio
and the various factors are shown. The correlation between
Factor V and Point Hour Ratio is the highest (0.44). Factor M
ranks second in its correlation with P. H. R., the correlation
being 0.31. The other five factors have correlations
with P. H. R. ranging from -0.24 to 0.19. One might speculate
on the meaning of the negative correlations, but with so
small and unrepresentative a sample such speculation would be unwise.

It is likely that in a really random sample of University 
students or of University freshmen such correlations would 
be zero or positive. 

The multiple correlation between P. H. R. and the 
weighted scores of the seven factors is 0.640. When the 
Ohio State University Intelligence Test score is included as a 
variable with the seven factors, the multiple correlation is 
0.648. Such a correlation suggests that there may be some 
justification for the use of the Primary Mental Abilities Tests 
for the prediction of academic success in college. 
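
The multiple correlation can be obtained from the correlations alone: the beta weights solve the system R_xx β = r_xy, and R² is the sum of each beta times the corresponding predictor-criterion correlation. The sketch below is a hedged illustration of that standard computation, not the authors' own procedure. To keep it small it uses only two predictors, Factors V and M, with the values reported for them in this study; the function names and the small equation solver are assumptions introduced for the example.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def multiple_correlation(r_xx, r_xy):
    """Return (beta weights, multiple R).

    r_xx: intercorrelations among the predictors (square list of lists).
    r_xy: correlation of each predictor with the criterion.
    """
    betas = solve(r_xx, r_xy)
    r_squared = sum(beta * r for beta, r in zip(betas, r_xy))
    return betas, math.sqrt(r_squared)


# Two-predictor illustration: Factor V and Factor M against Point Hour Ratio.
r_xx = [[1.00, 0.33],
        [0.33, 1.00]]   # V-M intercorrelation
r_xy = [0.44, 0.31]     # V and M with P. H. R.
betas, big_r = multiple_correlation(r_xx, r_xy)
print(betas, big_r)
```

With all seven factors the same function would take a 7 x 7 intercorrelation matrix; the two-predictor case is shown only because it can be checked by hand.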

2. What relationships are there between the Ohio State 
University Psychological Examination scores and the factor 
scores? 

As in the case with Point Hour Ratio, Factor V shows
the highest correlation with intelligence (0.52). This is
perhaps due to the fact that the expression of intelligence is
largely verbal in character in present tests. The Same-Opposite
Test, a component of Factor V, shows the highest
correlation of the several sub-tests with intelligence test
scores. A somewhat similar test is found in the Ohio State
University Psychological Test.


TABLE 1

COMPARISON OF CORRELATIONS OF FACTORS AND INDIVIDUAL TESTS
WITH INTELLIGENCE TEST SCORES AND POINT HOUR RATIO
(N = 49)

                                 Intelligence Test       Point Hour Ratio
                               Composite  Individual   Composite  Individual
                                 Factor       Test       Factor       Test
Factor P .....................    0.06                   -0.24
  Identical Forms ............               -0.05                   -0.21
  Verbal Enumeration .........                0.16                   -0.27
Factor N .....................   -0.02                    0.17
  Addition ...................               -0.07                   -0.07
  Multiplication .............                0.01                    0.29
Factor V .....................    0.52                    0.44
  [test name illegible] ......                0.41                    0.33
  Same-Opposite ..............                0.55                    0.37
Factor S .....................   -0.11                   -0.21
  [test name illegible] ......               -0.12                   -0.40
  [test name illegible] ......               -0.07                   -0.01
Factor M .....................    0.28                    0.31
  Initials ...................                0.34                    0.32
  Word-Number ................                0.09                    0.17
Factor I .....................    0.11                   -0.13
  Letter Grouping ............                0.24                    0.18
  [test name illegible] ......                0.04                    0.32
  Number Patterns ............                0.07                   -0.23
Factor D .....................    0.10                    0.19
  Arithmetic .................                0.09                    0.10
  Number Series ..............                0.36                    0.35
  Mechanical Movement ........               -0.13                    0.04


The correlation of Factor M with intelligence is 0.28.
The Initials Test correlated 0.32 with intelligence, while the
other component of Factor M, the Word-Number Test, correlated
very low (0.09).

The correlations of Factors P, I, and D with intelligence
are positive but very low. Among the components of
Factor D, the Arithmetic Test has a low correlation with
intelligence (0.09), the Number Series Test has one of the
highest correlations in the battery with intelligence, and the
Mechanical Movements Test shows a negative correlation
with intelligence. Such correlations might raise a question
regarding the functional unity of the factors.


TABLE 2

INTERCORRELATIONS BETWEEN POINT HOUR RATIO, PSYCHOLOGICAL
TEST, AND THE SEVEN FACTORS
(N = 49)

                              O.S.U.
                              Psych.
                              Test
                              (Intel-
                      P.H.R.  ligence)    P       N       V       S       M       I       D
P.H.R. .............    ....    0.20   -0.24    0.17    0.44   -0.21    0.31   -0.13    0.19
O.S.U. Psych. Test .    0.20    ....    0.06   -0.02    0.52   -0.11    0.28    0.11    0.10
Factor P ...........   -0.24    0.06    ....    0.34    0.27    0.54    0.14    0.47    0.10
Factor N ...........    0.17   -0.02    0.34    ....    0.43    0.18    0.28    0.38    0.42
Factor V ...........    0.44    0.52    0.27    0.43    ....    0.12    0.33    0.36    0.34
Factor S ...........   -0.21   -0.11    0.54    0.18    0.12    ....    0.08    0.34    0.19
Factor M ...........    0.31    0.28    0.14    0.28    0.33    0.08    ....    0.13    0.14
Factor I ...........   -0.13    0.11    0.47    0.38    0.36    0.34    0.13    ....    0.17
Factor D ...........    0.19    0.10    0.10    0.42    0.34    0.19    0.14    0.17    ....
Mean ...............    2.40    71.6   139.9   107.1    76.9   108.7    15.1    30.7     7.2
Standard Deviation .    0.59    20.3    25.1    36.4    23.8    36.4     7.6     7.8     2.4

3. How well can Point Hour Ratio be predicted on the 
basis of the primary factor scores? 

It would be desirable to be able to predict the probable
P. H. R. of a student from the scores made on the seven
factors. Table 3 shows that in both situations, with the OSU
Psychological Test included and with it omitted, the
highest beta weight is that for Factor V.


TABLE 3

BETA AND b REGRESSION COEFFICIENTS
For the Scores When the OSU Psychological Test is Included
and When It Is Omitted From the Test Battery

                         OSU Intelligence Included    OSU Intelligence Omitted
                             Beta           b             Beta           b
                         Coefficient   Coefficient    Coefficient   Coefficient
Factor P ..............     -.279         -.007          -.291         -.007
Factor N ..............      .034          .001           .090          .001
Factor V ..............      .568          .014           .487          .012
Factor S ..............     -.113         -.002          -.089         -.001
Factor M ..............      .216          .017           .191          .015
Factor I ..............     -.196         -.015          -.201         -.015
Factor D ..............      .046          .045           .040          .039
OSU Psychological
  Examination .........     -.004 [column not recoverable]


The correlation with P. H. R. is increased slightly when
the intelligence test rating is used, the correlation between
P. H. R. and the variables being raised from 0.640 to 0.648.
In a random sample of freshmen this difference would probably
be greater.
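
The beta and b columns of Table 3 are related by the usual rescaling b = beta × (S.D. of criterion / S.D. of predictor). The article does not state this relation, so the sketch below is offered as an assumption-labeled check rather than the authors' procedure. It uses the Factor V beta weights from Table 3 and the standard deviations from Table 2; the function name is hypothetical.

```python
def raw_score_weight(beta, sd_criterion, sd_predictor):
    """Convert a standardized (beta) weight to a raw-score (b) weight."""
    return beta * sd_criterion / sd_predictor


SD_PHR = 0.59        # standard deviation of Point Hour Ratio (Table 2)
SD_FACTOR_V = 23.8   # standard deviation of Factor V scores (Table 2)

b_included = raw_score_weight(0.568, SD_PHR, SD_FACTOR_V)  # OSU test included
b_omitted = raw_score_weight(0.487, SD_PHR, SD_FACTOR_V)   # OSU test omitted
print(round(b_included, 3), round(b_omitted, 3))  # 0.014 0.012
```

Both results agree with the b coefficients printed for Factor V, which suggests the table was built this way.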

4. Are the factor scores related to grades in specific 
college subjects? 

The correlations of course grades with Point Hour Ratio,
intelligence test scores, and the seven factors are found in
Table 4. The grades taken into consideration in this study
are those in English, science, foreign languages, and psychology.
The results must be taken as suggestive and not as
facts from which broad generalizations may be drawn.


TABLE 4

THE CORRELATION OF SUBJECT MATTER GRADES WITH POINT HOUR
RATIO, INTELLIGENCE, AND THE SEVEN FACTORS

                                                   Foreign
                           English    Science     Language   Psychology
                            Grade      Grade       Grade       Grade
Point Hour Ratio .......     0.72       0.85        0.77        0.58
Intelligence ...........     0.42       0.42        0.54        0.02
Factor P ...............     0.10      -0.12        0.27        0.10
Factor N ...............     0.34       0.03        0.45        0.37
Factor V ...............     0.75       0.68        0.44        0.59
Factor S ...............     0.44       0.23        0.56       -0.08
Factor M ...............     0.42       0.18        0.45        0.23
Factor I ...............     0.24       0.05        0.78        0.06
Factor D ...............     0.44       0.23        0.43        0.63
Number of Cases ........      27         30          27          25


In all four cases, there are high correlations between P. 
H. R. and grades. This is to be expected, since these grades 
are components of the Point Hour Ratio. 

English grades correlate highest with Factor V (0.75).
Factors S, M, and D also show correlations above 0.40 with
English grades.

The only factor showing a correlation above 0.40 with
science grades is Factor V.

All the factors are apparently important in determining
foreign language grades, since all factors except Factor P
correlate above 0.40 with foreign language grades. The
highest correlation with foreign language grades is that of
Factor I (0.78). This correlation is higher than one would expect,
but it may be due to the fact that an inductive method is used
at Ohio State University in teaching the beginning language
courses.

The highest correlation among the factors with psychol- 
ogy grades is with Factor D (0.63). Factor V is also high 
(0.59). These were the only two factors correlating above 
0.40 with psychology. 

Factor V correlates above 0.40 with grades in each of 
the four subject fields considered, the highest being with 
English grades. The correlations between Factor I and the 
school subjects are low with the exception of foreign lan- 
guage grade (0.78). Factor P shows very low correlations 
with all four school grades. There is little differentiation 
between the correlations of the school grades and Factor N, 
the only correlation higher than 0.40 being with foreign 
language grade. Factors S and M both have correlations over
0.40 with English and foreign language grades, and Factor D 
has a significant correlation with English, foreign language, 
and psychology grades. 

Such observations as reported here suggest that, with 
more experience, the Thurstone Primary Abilities Test will 
become a useful instrument in the academic counseling program 
of colleges. It will be necessary to secure more data in regard 
to the relationships of test scores and course grades from a 
random sample of freshmen. Also, it will be important to 
have some knowledge of methods of instruction in the several 
courses so as to judge whether the relationship observed is a 
function of the abilities of the student and the subject matter
being studied, or of the methods of instruction.




A SHORT CUT IN THE ESTIMATION OF
SPLIT-HALVES COEFFICIENTS


CHARLES I. MOSIER
Social Security Board

FOR SEVERAL YEARS the writer has availed himself
of a short cut in the computation of reliability coefficients
by the split-halves technique. The method has probably
been developed independently by a number of other
investigators, but it has not, to the writer’s knowledge,
appeared in print in connection with this specific problem,
and there may be some workers to whom it may prove useful.

With the development of the Kuder-Richardson method 
for the determination of reliability, the split-halves technique 
should probably disappear from the scene. However, as a 
number of investigators have found, it provides a fairly close 
approximation to the Kuder-Richardson value, and since it 
does not require an item-analysis it will probably continue in 
use. In any event, the purpose of this note is not the justifica- 
tion of the technique, but the presentation of a short cut. If 
split-halves coefficients are to be computed, they may as well 
be computed efficiently. 

In brief, the short cut involves the use of the complete
dependence of the “even” scores on the “total” and the “odd”
scores. We may suppose that “total” scores have already
been obtained in connection with the original purpose of the
test. Because of this algebraic dependency, then, it remains
only to rescore the papers for the “odd” scores in order to
know the even scores, since

$$E_i = T_i - O_i \tag{1}$$


Furthermore, equation (1) need not be applied to each case
separately. Not only may we dispense with the necessity of
rescoring the papers to get the “even” score for each individual;
we may go farther, and dispense with the necessity
of obtaining the individual “even” scores at all.

By defining the value desired, namely, the correlation between
O and E, and substituting the value of E expressed as
a function of O and T from equation (1), we obtain the result
that

$$r_{OE} = \frac{r_{OT}\,\sigma_T - \sigma_O}{\sqrt{\sigma_O^2 + \sigma_T^2 - 2\,r_{OT}\,\sigma_O\,\sigma_T}} \tag{2}$$





This expression calls for only the “odd” and “total” scores,
from which are obtained their respective standard deviations
and the correlation between them. The value obtained for
the odd-even correlation can then be used in the Spearman-Brown
formula to give the estimated reliability. This value
is identical with that which would be obtained if each test
paper were independently scored for the “even” score, and
the odd-even correlation coefficient computed from the resulting
data. (This can be seen from the derivation.)
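
The derivation referred to is short and may be sketched as follows. The covariance notation and the symbol $r_{tt}$ for the stepped-up reliability are introduced here for convenience and are not taken from the original note.

```latex
\begin{align*}
E &= T - O \\
\operatorname{cov}(O, E) &= \operatorname{cov}(O, T) - \sigma_O^2
   = r_{OT}\,\sigma_O\,\sigma_T - \sigma_O^2 \\
\sigma_E^2 &= \sigma_O^2 + \sigma_T^2 - 2\,r_{OT}\,\sigma_O\,\sigma_T \\
r_{OE} &= \frac{\operatorname{cov}(O, E)}{\sigma_O\,\sigma_E}
   = \frac{r_{OT}\,\sigma_T - \sigma_O}
          {\sqrt{\sigma_O^2 + \sigma_T^2 - 2\,r_{OT}\,\sigma_O\,\sigma_T}} \\
r_{tt} &= \frac{2\,r_{OE}}{1 + r_{OE}} \qquad \text{(Spearman-Brown)}
\end{align*}
```

Since each step is an identity, the short cut involves no approximation.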

The expression in equation (2) is readily recognized as 
a special case of the more general formula for a correlation 
between a part and the whole exclusive of the part. O, E, 
and T in equation (2) are by no means limited to “odd,” 
“even” and “total” scores, but apply to any set of variables 
for which equation (1) is true. 
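
The equivalence claimed for the short cut can be checked numerically. The Python sketch below computes the odd-even correlation from the “odd” and “total” scores alone, compares it with the value obtained by actually forming the “even” scores, and then applies the Spearman-Brown formula. The score lists are hypothetical and the function names are assumptions made for this illustration.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def sd(xs):
    """Standard deviation (divisor N; the choice cancels in the formula)."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def pearson(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (sd(xs) * sd(ys))


def odd_even_r(odd, total):
    """Odd-even correlation from odd and total scores only (the short cut)."""
    r_ot = pearson(odd, total)
    s_o, s_t = sd(odd), sd(total)
    return (r_ot * s_t - s_o) / math.sqrt(s_o ** 2 + s_t ** 2 - 2 * r_ot * s_o * s_t)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test reliability."""
    return 2 * r_half / (1 + r_half)


# Hypothetical scores for six examinees.
odd = [10, 12, 9, 15, 7, 11]
total = [21, 23, 17, 31, 15, 20]
even = [t - o for t, o in zip(total, odd)]  # formed only to check the result

shortcut = odd_even_r(odd, total)
direct = pearson(odd, even)
print(abs(shortcut - direct) < 1e-9)  # True
print(spearman_brown(shortcut))
```

The `even` list is built here only to verify the result; in practice the point of the method is that it never needs to be formed.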














MEASUREMENT ABSTRACTS*

*Edited by Forrest A. Kingsbury.


Adams, C. R. “A New Measure of Personality.” Journal of
Applied Psychology, XXV (1941), 141-151.

A new instrument for measuring personality traits, The
Personal Audit, is described. It was intended to be relatively
free from highly personal items since it was felt that such a
test would be more useful in non-clinical situations. The Personal
Audit is believed, on the basis of low intercorrelations
between sub-tests, to measure 9 relatively independent personality
traits. Coefficients of reliability (corrected split-half)
range from +.90 to +.96. Items have been validated by the
criterion of internal consistency using a modified version of
the Sletto technique. W. A. Varvel.





Baxter, B. and Paterson, D. G. “A New Ratio for Clinical
Counselors.” Journal of Consulting Psychology, V
(1941), 123-126.

The magnitude of the standard error of measurement (S.E.m),
which clinical counselors
employ in interpreting test scores, varies in significance with
the variability (S.D.) of the norm group. It is useful, therefore,
to relate them in a ratio as an aid in interpreting
scores. The following formula, in which r is the reliability
coefficient, provides a simple way of expressing the magnitude
of S.E.m as a percentage of S.D.:

$$\frac{S.E._m}{S.D.} = \sqrt{1 - r}$$

Application of this ratio to a list of 49 tests shows that
S.E.m/S.D. ranges from as low as .10 to as high as .55. In general,
achievement tests show the lowest ratio (highest accuracy)
with an average of .20, followed in order by scholastic
aptitude tests (averaging .30), reading tests (.32), special
aptitude tests (.33), and personality tests (.27 to .55, average
.40). F. A. Kingsbury.
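
The ratio described in this abstract depends only on the reliability coefficient. A brief Python sketch follows; the reliability values in the loop are hypothetical and were chosen only to span the range of ratios quoted above, and the function name is an assumption.

```python
import math


def se_to_sd_ratio(reliability):
    """Standard error of measurement as a fraction of S.D.: sqrt(1 - r)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return math.sqrt(1.0 - reliability)


# Hypothetical reliability coefficients, for illustration only.
for r in (0.99, 0.96, 0.91, 0.84, 0.70):
    print(r, round(se_to_sd_ratio(r), 2))
```

A ratio of .20 thus corresponds to a reliability of about .96, and a ratio of .55 to a reliability of about .70.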





Bennett, G. K. and Raskow, S. “Extension of the Norms of
the Columbia Vocabulary Test.” Journal of Applied 
Psychology, XXV (1941), 48-51. 

Constructed and standardized for grades 3-8, this test 
showed a mean score of 54 and standard deviation of 15 for 
the latter half of the eighth grade. Extension of norms seems 
justified when 1212 superior recent high school graduates 
obtained a mean score of 74 with standard deviation of 12. 
When the test was administered to 5101 high school students, 
mean scores increased and the standard deviation decreased 
from grade 9A through grade 12B both among commercial 
and general-course students. Decreasing standard deviation 
probably indicates increasing homogeneity of vocabulary in 
later school years. With grade constant, mean scores decrease 
with age. J. E. P. Libby. 





Blum, Milton L. and Candee, Beatrice. “The Selection of
Department Store Packers and Wrappers with the Aid of 
Certain Psychological Tests: Study II.” Journal of Ap- 
plied Psychology, XXV (1941), 291-299. 

This study is a check on conflicting results in previous 
attempts to determine the value of finger dexterity tests in 
predicting successful wrappers or packers. Tests used were 
the O’Connor Finger Dexterity, Zeigler Placing, Otis Self- 
Administering, and Minnesota Clerical. Test performance was 
checked against production records and foreman’s ratings. 
Results indicate no relation between finger dexterity and pro- 
duction for either packers or wrappers. In the experienced 
group the Minnesota Clerical shows positive correlation for 
both groups. It is concluded that clerical speed and accuracy 
have a much higher relation to production than has finger 
dexterity. D. A. Peterson. 




Brown, A. W. and Blakey, R. “A Preliminary Report on the
Development and Standardization of a Non-Verbal Test 
at the High-School Level.” Journal of Educational Psy- 
chology, XXXII (1941), 113-123. 

A series of 11 non-verbal subtests constructed on the con- 
cepts of primary mental abilities has been standardized on a 
group of 286 suburban high school students. Eight of these 
subtests, two of perceptual speed, two of spatial relations, and 
four of abstract reasoning, constitute the final test, which may 
be given in forty minutes. The “Non-Verbal Reasoning Test”
correlates with school grades .47 and with Otis I.Q. .59; Otis
I.Q. correlates with school grades .60. Higher correlation
with grades was not expected since the latter involve other 
abilities in addition to those in the non-verbal test. Reliability 
of the test is .97. Tentative norms are given, including derived 
scores intended to take the place of I.Q.’s at this level; stand- 
ardization on a much larger sample is being undertaken. J. E. 
P. Libby. 





Brown, A. W. and Cotton, C. B. “A Study of the Intelligence
of Italian and Polish School Children from Deteriorated 
and Non-Deteriorated Areas of Chicago as Measured by 
the Chicago Non-Verbal Examination.” Child Develop- 
ment, XII (1941), 21-30. 

1262 Italian and Polish school children in a deteriorated 
and a non-deteriorated area were tested with a non-language 
group test battery. The children were in the fourth grade or 
above and from 10 to 14 years of age. The authors stress the 
influence of socio-economic, cultural, and educational factors 
upon test scores. They found (1) a regular decrease in mean 
test performance from age 10 to age 14 for both sexes and 
both nationality groups but not so great as that previously re- 
ported for verbal tests; (2) sexual differences favoring the 
boys, particularly in the case of Italian children; and (3) 
contradictory indications relating to socio-economic community 
level (no significant differences between areas for Italian 
boys; significant differences favoring the deteriorated area for
Italian girls; a tendency for Polish children in the non-deteriorated
area to make better scores). W. A. Varvel.





Brush, Edward N. “Mechanical Ability as a Factor in Engi-
neering Aptitude.” Journal of Applied Psychology, XXV 
(1941), 300-312. 

This study was intended to explore the possibilities of 
available tests of mechanical ability and aptitude as indicators 
of aptitude for engineering. The report is prefaced with a 
survey of the relevant literature. The subjects were two groups 
of students in the College of Technology at the University of 
Maine, one group of 104 members, the other group of about 
130 members. The criterion was scholastic rank in courses of 
an engineering nature. The tests used were: Minnesota Paper 
Form Board, Minnesota Assembly Test, Minnesota Spatial 
Relations Test, O’Connor Worksample No. 1, O’Connor 
Worksample No. 5, O’Connor Worksample No. 72, Cox 
Mechanical Explanation and Completion Test, Cox Mechani- 
cal Models Test, and MacQuarrie Test for Mechanical 
Ability. In addition data on intelligence tests, algebra, chemis- 
try, plane geometry, and physics tests were also available. 

The conclusions reached are summarized as follows: “The
tests of useful predictive power were the Cox Tests of Mechan- 
ical Aptitude and Minnesota Paper Form Board... . Bat- 
teries of mechanical ability tests yield correlations with the 
criterion of about .40; batteries in which an intelligence test is 
combined with one or two tests of mechanical ability yield 
correlations of about .50.... several batteries of mechanical 
ability tests predict engineering scholarship at least as well as 
the intelligence tests, while the achievement tests, singly and in 
combination, predict success in engineering studies somewhat 
better than do the tests of mechanical ability . . . . total 
engineering record is more highly correlated with first 
semester and first year grades than with any test or combination
of tests.” J. E. Karlin.





Burtt, H. E. and others. “Market Problems and Market Research.”
Journal of Consulting Psychology, V (1941),
No. 4, 145-193.

This entire number is devoted to eight papers on market 
research, not separately abstracted here because of space 
limitations. The authors and titles are as follows: “Current 
Trends in Marketing Research” (H. E. Burtt); “Proving
Ground on Public Opinion” (H. G. Weaver) ; “Problems of 
Sampling in Market Research” (Frank Stanton); “Charac- 
teristics of the Question as Determinants of Dependability” 
(J. G. Jenkins) ; “Evaluating the Effectiveness of Advertising 
by Direct Interviews” (P. F. Lazarsfeld) ; “Effects of Re- 
peated Interviewing on the Respondent’s Answers” (F. D. 
Ruch); “The Museum Technique Applied to Market Re- 
search” (G. K. Bennett); and “The Role of Psychological 
Interpretation in Market Research” (A. W. Kornhauser). 
F. A. Kingsbury. 





Casanova, T. “Analysis of the Effect upon the Reliability
Coefficient of Changes in Variables Involved in the Estima- 
tion of Test Reliability.” Journal of Experimental Educa- 
tion, IX (1941), 219-228. 

The following topics are discussed and various formulae 
developed in detail: (1) the variance of the halves in the 
split-half method of estimating reliability; (2) the correction 
for guessing with specific reference to the reliability of rights 
and wrongs, the variance of rights and wrongs, the correlation 
of rights with wrongs, the variance of the number of items 
attempted, and the number of possible choices; (3) the effect 
of calling all negative scores zero; (4) the variance of the 
items. In the latter case, a formula for estimating the reliability
of a test in terms of the item variances is presented
which is felt to be more convenient than the Kuder-Richardson
formulae. W. A. Varvel.





Cattell, Raymond B., Feingold, S. Norman, and Sarason, Seymour
B. “A Culture-Free Intelligence Test: II. Evaluation
of Cultural Influence on Test Performance.” Journal of
Educational Psychology, XXXII (1941), 81-100.

A culture-free intelligence test described in an earlier paper 
was administered together with the Binet (Terman-Merrill), 
A.C.E. (arithmetical sections), and the Arthur Performance 
Tests, to four comparable groups; these were given special 
training, one group in each class of information or skill demanded
by the tests. Retest analyses showed the Arthur least
influenced by training in its own culture medium, the Culture- 
Free Test next, Binet next, and A.C.E. most influenced. The 
above tests and the Ferguson formboards were administered 
to a group of adult immigrants, resident in this country about 
one year, and a control native group. The Ferguson was very 
close to the Arthur, others followed in the order noted earlier, 
when the groups were retested after 77 days during which the 
immigrants gained noticeably in Americanization. Reliability 
of the Culture-Free Test compares favorably with those of the 
others. Adequate validity is indicated by the Culture-Free 
Test’s high loading in the general factor brought out by 
tetrads, and by its high mean correlation with the pool of 
tests. Since life experience probably brings factors in the Cul- 
ture-Free Test to saturation in widely different cultures, its 
proper application appears broader than that of preceding 
tests. J. E. P. Libby. 





Driver, Randolph S. “The Validity and Reliability of Ratings.”
Personnel, XVII (1941), 185-191.

Rating is of value in industry only when its limitations as
a scientific instrument are fully appreciated. The various current
methods of obtaining measures of validity and reliability
are discussed and their values and limitations considered. In
order for a rating to be acceptable, it must be proven valid
and reliable. Although this is difficult to accomplish, ratings are not
useless, but great caution must be observed in their interpretation.
Virginia Brown.





Dudycha, George J. “A Suggestion for Interviewing for Dependability
Based on Student Behavior.” The Journal of
Applied Psychology, XXV (1941), 227-231.

College students were divided into groups of extreme 
earliness and lateness, of dependability and undependability, 
on the basis of observation of their behavior in life situations. 
Ten questions on punctuality and persistence and six on de- 
pendability, when presented to these contrasting groups, 
elicited responses indicating significant group differences. Since 
these questions appear to be diagnostic in student behavior, it 
is suggested that they be tested for usefulness in employment 
situations for discovering those applicants likely to prove 
dependable. Virginia Brown. 


Dulsky, S. G. “Vocational Counseling. I. By Use of Tests; 
II. By Interview.” Personnel Journal, XX (1941), 16-28.

The author briefly and critically examines various types of
standardized tests available to the vocational counselor. He
concludes that aptitude tests are of no value and personality 
tests of very limited value. Interest inventories, if used prop- 
erly, may be helpful. Tests of intelligence and educational 
achievement are approved as being of the most value. He 
advocates greater emphasis on the vocational interview as a 
means of diagnosing personality and motivation and of identi- 
fying and evaluating interests. Self-guidance from the study 
of test scores and profiles is impossible. Vocational counseling 
is an individual process, requiring “skilled psychologists” 
rather than ‘“‘mental testers.” The vocational counselor should 
confine himself to descriptive rather than quantitative reports 
of test and interview results and should only rarely go beyond 
general recommendations to his clients. W. A. Varvel.


Ebert, Elizabeth H. “A Comparison of the Original and 
Revised Stanford-Binet Scales.” The Journal of Psychol- 
ogy, XI (1941), 47-61. 

1434 records of 315 children five to ten years of age were
studied for information as to the comparability of I.Q.’s from
the original and revised Stanford-Binet Scales. An increasing
discrepancy between I.Q. values on the two scales was found
at ages 7, 8, and 9. The new revision tends to give lower
I.Q.’s for levels below 100 and higher I.Q.’s for levels above
100. The duller children gain slightly more in I.Q. than the
brighter ones although both groups show increases. With the
old revision, the duller individuals gain but the brighter ones
lose. Although the average I.Q. of the 1916 revision was more
constant, individuals maintained their relative positions better
in the new revision. Virginia Brown.





Eysenck, H. J. “Type-Factors in Aesthetic Judgments.”
British Journal of Psychology, XXXI (1941), 262-270.

It has been found previously that the analysis of the inter- 
correlations between the rankings of pictures by a number of 
subjects yields mainly one general factor with no other sig- 
nificant factor. On this occasion the attempt is made to bring 
out the influence of any such secondary factor, even, if need 
be, at the expense of the “T” or general factor. Five series 
of pictures, each consisting of thirty to fifty items, were judged 
in order of goodness by fifteen subjects. The subjects were 
artists, university students, bank clerks, typists, and teachers, 
eight women and seven men, with age range from 20 to 70. 
The table of correlations for each of the five series was factored
and two significant factors extracted in all cases except
one. One factor was the “T” factor previously identified; the
other factor, called the “K” factor, seemed to divide the
population into two different “types,” one preferring the modern,
and the other the older style of painting. This factor,
identified provisionally with “brightness,” correlated with
extroversion, radicalism, youth, and possibly with preference
for color. The color-form test also appeared to be correlated
with extroversion. Results are definite enough to suggest that
further research into the relation between temperament and
aesthetic preferences will not only extend knowledge of the
“type” factors in aesthetic judgments, but also increase understanding
of temperamental “types.” J. E. Karlin.





Ferguson, L. W. “A Study of the Likert Technique of Attitude
Scale Construction.” Journal of Social Psychology,
XIII (1941), 51-57.

The suggestion is here examined that Likert’s method of 
constructing and scoring attitude scales gives results as valid 
as those of the method outlined by Thurstone and Chave with 
much less labor. Items constructed by the former method 
(Minnesota Scale for the Survey of Opinions) were rescaled 
by the latter; standard deviations of the distribution of scale 
values indicate that such items are adequately scaled by this 
method. The scale values obtained indicate that Likert’s 
technique does not obviate the need of a judging group. With 
one exception, the scales cannot be scored by the Thurstone 
method. Scores obtained by the two methods for the exceptional
scale show a correlation of .70, confirming the conclusion.
J. E. P. Libby.


Greene, E. B. Measurements of Human Behavior. New York:
The Odyssey Press. Pp. 777. 1941.

This volume of 24 chapters is divided into three parts: 
Part I, “Basic Considerations” (discussing introductory concepts,
varieties of appraisals, score-interpretation, measures of
relationship, types of instruments, item construction and 
evaluation, factor analysis); Part II, “Instruments and Re- 
sults” (tests of early childhood, of achievement, Binet-type 
and group intelligence scales, performance, mechanical and 
motor tests, measures of fine arts—design, literature, and 
music—tests of interests, attitudes, adjustment); Part III, 
“Persistent Problems” (effects of practice on scores, measures 
of growth and senescence, absolute scaling, evaluation of judg- 
ments, native differences). A 30-page bibliography, a com- 
bined glossary and subject-index, 121 tables, and 108 figures 
are features of the book. F. A. Kingsbury.


Guilford, J. P. “A Note on Dubois’s Method of Deriving an
Achievement Ratio for Students.” Journal of Educational 
Psychology, XXXII (1941), 220-222. 

Dubois’s achievement ratio is that of the student’s actual 
average mark to that mark corresponding to the standard score
obtained on a psychological test; these ratios in general are 
low for students with high test scores and high for those with 
low scores. This finding follows from the assumption of a 
correlation of 1.00 between test scores and marks, while 
Dubois gives the correlation as .442. A student may be expected
to deviate from the mean mark by only .442 as much
as his standard score indicates. It is suggested that Dubois’s 
conclusion might be reversed if the regression line for r = .442
be taken as base. Computation of a special but meaningful
case confirms this suggestion. J. E. P. Libby.
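
The argument summarized above can be followed with a small numerical sketch. Only the correlation of .442 comes from the abstract; the mean mark, the standard deviation of marks, and the function names are hypothetical and chosen solely for illustration. The sketch takes students whose marks fall exactly on the regression line and computes their achievement ratios against two bases: one that assumes a correlation of 1.00 and one that uses the regression line.

```python
R = 0.442          # correlation between test scores and marks (from the abstract)
MEAN_MARK = 75.0   # hypothetical mean mark (assumption for illustration)
SD_MARK = 10.0     # hypothetical standard deviation of marks (assumption)

def base_mark(test_z, r):
    """Mark predicted from a test standard score when the correlation is r."""
    return MEAN_MARK + r * test_z * SD_MARK

def achievement_ratio(actual_mark, test_z, r):
    """Ratio of the actual mark to the mark predicted from the test score."""
    return actual_mark / base_mark(test_z, r)

for z in (-2.0, 0.0, 2.0):
    typical = base_mark(z, R)                      # student on the regression line
    ratio_r1 = achievement_ratio(typical, z, 1.0)  # base assumes r = 1.00
    ratio_reg = achievement_ratio(typical, z, R)   # base is the regression line
    print(f"z={z:+.1f}  mark={typical:.2f}  "
          f"ratio(r=1.00)={ratio_r1:.3f}  ratio(r=.442)={ratio_reg:.3f}")
```

Under the first base the ratio falls below 1 for the high scorer and rises above 1 for the low scorer even though both students perform exactly as predicted; under the regression base both ratios equal 1.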





Hay, Edward N. “Tests in Industry.” Personnel Journal,
XX (1941), 3-15.

This is a discussion of the opportunities for psychologists 
in industry. The use of intelligence tests is coming more and 
more to act as a check on the employer’s judgment, which is customarily
biased in favor of the qualities of aggressiveness and
good personality. Such tests indicate the level at which the 
employee is able to work most efficiently and his potentialities 
for further promotion. It is particularly important to obtain 
psychological information about an employee at the time of 
his entry into a firm since the work he does then determines 
to a large extent his opportunities for advancement. A be- 
ginning job may require an I.Q. of about 100 but higher 
positions require higher I.Q.’s so that an employee progress- 
ing reasonably well in the initial job may become unfit when 
advanced to the more complex positions. It becomes advisable 
to judge prospective employees not on the basis of the intel- 
ligence required for their first positions but for the positions 
to which they should be able to rise. With the use of objective 
tests, information becomes generally available for an entire 
firm so that transfers and promotions from one department to 
another can be advised with a minimum of further consulta- 
tion, since the qualities required in other work are known and 
the abilities of the employee are likewise known at the time of 
first testing. Apart from the question of job maladjustment 
there is a further fruitful field for the industrial psychologist
in the problem of a better supervisor-employee relationship. 
An illustrative study of these methods at work accompanies 
the discussion. J. E. Karlin. 


Johnson, Donald M. and Reynolds, Floyd. “A Factor Analysis 
of Verbal Ability.” Psychological Record, IV (1941), 
183-195. 

The literature on problem solving among animal and 
human subjects suggests that there may be two fundamental 
processes involved: “F,” the flow of various acts or responses;
and “S,” the selection of these responses according to the 
requirements of the problem. This study tested the hypothesis 
that individual differences in these two processes are a major
determinant for scores on problem-solving tests. This investi- 
gation was limited to verbal problems. There were ten tests 
involving the supplying of verbal responses; the tests varied 
in restriction of choice of responses from complete freedom to 
supply any word to restriction to the supplying of only certain 
words according to a rigid criterion. The subjects were 113 
summer-school students at Fort Hays Kansas State College. 
A centroid analysis of the table of corrected correlation co- 
efficients yielded two factors. The tests fell within a positive 
manifold, after rotation, indicating two definite factors reasonably
identified as the “F” and “S” postulated in the hypothesis.
It appears that these two factors are probably closely related
to, if not identical with, Thurstone’s “W” and “V” factors. It
is concluded that the two processes or functions mentioned 
account to a large extent for the variance in verbal problem- 
solving tests. These findings are further discussed with refer- 
ence to tests of vocabulary, intelligence, and reading. J. E. 
Karlin.


Kornhauser, A. W. and Schultz, R. S. (et al.). “Research on
Selection of Salesmen” (and other papers). Journal of
Applied Psychology, XXV (1941), No. 1, 1-47. 

Five papers read at the Section of Industrial and Business
Psychology of the American Association for Applied Psychology
in 1940, together with an introductory article, are presented
in this number, but are not separately abstracted
because of space limitations. In addition to the introductory 
article (title as above), the authors and titles of the papers 
are as follows: “Selection of Casualty and Life Insurance
Agents” (M. A. Bills); “Recent Research in the Selection of
Life Insurance Salesmen” (A. K. Kurtz); “A Report of Research
on the Selection of Salesmen at the Tremco Manufacturing
Company” (O. A. Ohmann); “Procedures for the
Selection of Salesmen for a Detergent Company” (J. L. Otis);
and “Selection Research in a Sales Organization” (T. M.
Stokes). F. A. Kingsbury.


Lowell, Frances E. “A Study of the Variability of I.Q.’s in 
Retests.” Journal of Applied Psychology, XXV (1941),
341-356. 

The main purpose of this study was to seek corroboration
of the results obtained in Cleveland Public Schools in recent 
years which seemed to show that the I.Q.’s of school children 
tended in certain instances to vary between test and retest. 
The data were composed of 1000 cases that had two tests 
only, 1000 cases that had three tests, and 1000 cases that had 
four tests. The Terman 1916 revision of the Binet was used 
in all tests. It was found that there are significant decrements 
in I.Q. both for groups and for chronological age. Further- 
more, the I.Q. range, the chronological age at first test, and 
the interval elapsing between first and last tests may all be 
eliminated as causes for variation in I.Q. on retest. Nor does 
sex influence variations in I.Q. between first and last tests. 
On the average, four times as many cases on retest decrease 
in I.Q. as increase. In particular, those cases that increase 7 
or more points on the first retest decrease 5 times as often as 
they increase on the second retest. The data on the first retest 
seem to indicate that the older the child is, the less chance 
there is that his second I.Q. will increase. J. E. Karlin.


McCloy, C. H. “The Factor Analysis as a Research Technique.”
Research Quarterly, XII (1941), 22-33.

This paper presents an elementary discussion of some
fundamental concepts and limitations of factor analysis. Par- 
ticular reference is made to its possible uses in the field of 
health and physical education. Specific examples of precautions 
to be taken and the kind of studies to which this type of 
correlational analysis may be applied are given in terms of 
research in physical education. The method has been utilized 
in (1) studies of motor skills, (2) analysis of anthropometric 
data, (3) analysis of cardiovascular variables, and (4) studies 
of character and personality traits. A 17-item bibliography is 
included. W. A. Varvel.


Mosier, C. I. “A Psychometric Study of Meaning.” Journal
of Social Psychology, XIII (1941), 123-140.

256 adjectives expressing judgmental relationships which 
could be placed along a favorable-neutral-unfavorable con- 
tinuum were rated on an 11-point scale by college students in 
psychology. Some 140 ratings were obtained for each word. 
“Two basic hypotheses . . . . are confirmed: first, that the 
meaning of a word may be considered as if it consisted of two 
parts, one constant and representative of the usual meaning of 
the word, and one variable, representative of individual inter- 
pretation in usage and associated context and general usage; 
second, that the frequency with which any particular meaning 
is evoked is describable by the Gaussian Law.” The presence 
of words with two discrete meanings, yielding bimodal fre- 
quency distributions of responses, was noted. The effect of 
adverbial modifiers on the meaning of an adjective was studied. 
“A scale with a rational basis has been developed and values 
describing quantitatively the modal meaning and the ambiguity 
of more than 200 adjectives have been obtained.” W. A.
Varvel. 





Oral Trade Tests—Group Leaders’ Handbook. The Per- 
sonnel and Training Section in collaboration with the Local 
Office Operations Section and Chicago Occupational Re- 
search Center. Division of Placement and Unemployment 
Compensation, 205 West Wacker Drive, Chicago, Illinois.
February, 1941. 18 pp.

This handbook is designed to instruct the group-leaders 
in interviewing in the construction and use of Oral Trade 
Tests. It is based on Oral Trade Questions, Vol. I, prepared 
by Occupational Analysis Section, United States Employment 
Service Division (not available for general use). There are 
three divisions: History of Oral Trade Questions; Preparation
of Oral Trade Questions; and Application of Oral Trade
Questions in Operating Offices. Specific examples of the application
of oral trade questions are given. D. A. Peterson.


Osgood, C. E. and Stagner, Ross. “Analysis of a Prestige
Frame of Reference by a Gradient Technique.” Journal
of Applied Psychology, XXV (1941), 275-290.

This study was designed to demonstrate a method for 
analyzing a frame of reference, and to investigate the par- 
ticular determinants of the frame of reference known as occu- 
pational prestige. Subjects were required to judge a number 
of occupational stereotypes with respect to a psychological 
“gradient” continuum varying from the description “brains” 
on the one extreme to “brawn” on the other. In a second part
of the test the judgments were made about persons rather than 
occupations. The subjects were 100 Dartmouth College men, 
students in introductory psychology, 50 of whom filled out the 
“job” form and 50 the “person” form. There were 15 names 
of occupations and each was accompanied by a set of ten 
characteristics. It was found that general rankings for prestige 
correlate on the average highly with median judgments on the 
gradient test, but that the reactions on the job forms were 
significantly different from the person forms. Prestige is 
imputed to occupations per se on the basis of such character- 
istics as hopefulness, being noticed, financial return, brains; 
prestige is imputed to men in specified jobs on the basis of 
brains, leadership, and self-assuredness. Since the conditions 
of the experiment are deemed to exclude the possibility of 
conscious verbalization of a prestige frame of reference, it is 
concluded that the mere presentation of a set of occupational
stereotypes for a series of judgments caused the spontaneous 
establishment of a prestige framework which then determined 
in a reliable manner judgments on the specific traits listed. The 
technique is practical and adaptable. J. E. Karlin. 


Powell, N. J. “Check List for Use in Civil Service Objective 
Test Preparation.” Public Personnel Quarterly, II (1941), 
13-16. 

The author has prepared a diagnostic check list designed 
to increase the probability of considering all the major bases 
for appraisal of the test being constructed. Guiding questions 
are listed under each of the construction problems for examin- 
ing the individual item and the test as a whole. A dual criterion 
is suggested and instructions for use of check list are given. 
It is emphasized that while the degree of correlation between 
test score and job performance is important, it is not the only 
indicator of adequacy of examination. D. A. Peterson.


Powell, N. J. “Steps in Written Test Construction.” Public
Personnel Quarterly, II (1941), 73-76.

The process of constructing a written test is analyzed, 
assuming that examinations are made public (i.e., a test item 
cannot be used more than once). The following general prob- 
lems are treated in outline form: 1. the determination of the 
abilities to be measured; 2. the determination of the test con- 
tent which measures the desired abilities; 3. the allocation of 
emphasis; 4. the preparation of the test items; 5. the arrange- 
ment and editing of the test items; 6. the experimental tryout; 
7. final test copy; and 8. general considerations with regard to 
test preparation integrity. D. A. Peterson.





Reyburn, H. A. and Taylor, J. G. “Some Factors in Intel- 
ligence.” British Journal of Psychology, XXXI (1941), 
249-261. 

This study is intended to throw further light on the controversy
regarding the unitary functioning of a general factor,
g, in tests of intelligence. The material consisted of ten tests
purporting to measure some aspect of intelligence, the tests
being formboards, repetition of digits, repetition of digits 
backwards, matching tests, absurdities, Porteus mazes, arith- 
metical reasoning, reasoning tests, vocabulary tests, and dis- 
sected sentences. The tests were given to 1497 South African 
children with ages ranging from 12 to 18. Five factors were 
extracted from centroid analysis of the inter-test correlations. 
The axes were then rotated orthogonally so as to preserve a
positive manifold and, if possible, retain a general factor 
present in all the tests. It turns out, however, that no general 
factor is present. Three factors are immediate memory span 
(in digits forwards and digits backwards), verbal (in dis- 
sected sentences and vocabulary) and perceptual dexterity (in 
dissected sentences, matching, mazes); the two other factors
are present in equal proportions in matching, arithmetic, and 
reasoning. Neither of these two is g as ordinarily operation- 
ally defined; one factor is the ability to find or make a 
significant pattern in a mass of irrelevant material, and the 
other factor is the ability of logical elimination. The sugges- 
tion is made that g in this battery is complex and that orthodox 
tests of g need to be constructed to preserve its functional 
unity. J. E. Karlin.


Roff, Merrill. “A Statistical Study of the Development of
Intelligence Test Performance.” Journal of Psychology,
XI (1941), 371-386.

Using data available in the literature, correlations between 
test performance of children at a specific age and the gain in 
their performance one or more years later were estimated. 
The fact that the correlations showed no tendency to increase 
as the interval between test and retest increased indicates that
the “Constancy of the I.Q.” is due primarily to retention of 
earlier skills and knowledge rather than to correlations be- 
tween earlier scores and later increments. On the assumption 
that the I.Q. variability is constant, the same procedures were 
used to find correlations which would result if scores and later 
increments were uncorrelated. No comparison of these values 
and empirical findings is made. Lorraine Bouthilet. 
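
The link between a test-retest correlation and the correlation of an initial score with a later gain follows from ordinary variance algebra, and a brief sketch may make the abstract easier to follow. This is not Roff's own procedure, which the abstract does not describe in detail. It assumes scores expressed in units whose spread grows with age; the standard deviations and function names are hypothetical.

```python
import math

def score_gain_correlation(r12, sd1, sd2):
    """Correlation between an initial score and the gain (retest minus test).

    Uses cov(x1, x2 - x1) = r12*sd1*sd2 - sd1**2 and
    var(x2 - x1) = sd1**2 + sd2**2 - 2*r12*sd1*sd2.
    """
    gain_sd = math.sqrt(sd1**2 + sd2**2 - 2.0 * r12 * sd1 * sd2)
    return (r12 * sd2 - sd1) / gain_sd

def retest_correlation_if_gain_uncorrelated(sd1, sd2):
    """Test-retest correlation implied when gains are unrelated to initial scores."""
    return sd1 / sd2

# Hypothetical standard deviations of scores at the earlier and later ages.
sd_early, sd_late = 10.0, 12.5
r_implied = retest_correlation_if_gain_uncorrelated(sd_early, sd_late)
print(round(r_implied, 3))
print(abs(score_gain_correlation(r_implied, sd_early, sd_late)) < 1e-9)
```

The point of the sketch is that a substantial test-retest correlation can arise from retained earlier performance alone, with no correlation at all between earlier scores and later increments.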


Schellhammer, Fred M. “The Intelligence Test in Teacher-
Training Institutions.” School and Society, LIII (1941),
319. 

In a survey of 150 teacher-training institutions, 103 were 
found to use the intelligence test in student selection and 
evaluation of intelligence, 18 relying on it solely and 85 using 
it in conjunction with other techniques such as high school 
records and interview and faculty reports, no one combination 
finding universal favor. The majority of institutions con- 
sidered the high school record as important as the intelligence 
test, and were supplementing both measures with subjective 
techniques. Virginia Brown. 





Super, D. E. “A Comparison of the Diagnoses of a Graphologist
with the Results of Psychological Tests.” Journal
of Consulting Psychology, V (1941), 127-133.

To check the claims of a woman “graphologist,” 24 stu- 
dents submitted samples of their handwriting and obtained 
the graphologist’s diagnoses. These were compared with the 
most appropriate of several test scores (Intelligence, Fryer 
& Sparling’s Occupational Intelligence Norms, Strong Voca- 
tional Interest, and Bernreuter Personality Inventory). Use 
of chi-square and other methods showed no more than chance 
relationship between occupations recommended and those in- 
dicated as suitable for intelligence scores obtained; occupa- 
tions rated as unsuitable by interest tests were recommended 
with more than chance frequency; personality traits were 
estimated by the graphologist with no more than chance 
agreement with test scores (on four traits), and worse than 
chance agreement (on two traits). F. A. Kingsbury.





Thomson, Godfrey. “Critical Notice of ‘The Factors of the
Mind’ by Cyril Burt.” British Journal of Educational 
Psychology, XI (1941), 45-51. 

Thomson writes a brief review of Burt’s most recent book
(The Factors of the Mind, Univ. of London Press, 1940, xiv
+ 509). The major portion of the review considers Burt’s
section on the distribution of temperamental types and the
application of factor analysis to persons as well as to tests. 
Thomson does not agree that the philosophical approach to 
factor analysis is easier or more illuminating than the geomet- 
rical but he does express agreement with Burt’s conclusions as 
to the metaphysical status of mental factors. W. A. Varvel. 





Traxler, Arthur E. and others. “Psychological Tests and
Their Uses.” Review of Educational Research, XI 
(1941), 1-130. 

This issue, consisting of eight papers not separately 
abstracted because of space limitations, is concerned with the 
construction, evaluation, and application of psychological 
tests. Individual articles are accompanied by extensive bibliog- 
raphies. The following is a list of authors and titles: 

I    “Brief Overview of the Period” (Arthur E. Traxler)
II   “Current Construction and Evaluation of Intelligence
     Tests” (Dewey B. Stuit)
III  “Applications of Intelligence Tests” (J. B. Stroud)
IV   “Measurement of Aptitudes in Specific Fields” (David
     Segel)
V    “Current Construction and Evaluation of Personality
     and Character Tests” (Arthur E. Traxler)
VI   “Projective Methods in the Study of Personality”
     (Percival M. Symonds)
VII  “Applications of Personality and Character Measurement”
     (John W. M. Rothney)
VIII “Statistical Methods Related to Test Construction and
     Evaluation” (John C. Flanagan)

D. A. Peterson.





Traxler, Arthur E. “Stability of Scores on the Primary Men- 
tal Abilities Tests.” School and Society, LIII (1941), 
255. 

Test-retest correlations after one year ranging, with one 
exception, from .578 to .917 were found for the scores of 104 
pupils in grades X-XII on Thurstone’s Primary Mental Abili- 
ties Tests. The guidance value of the perceptual, memory, 
and inductive tests may be limited, for their correlations fell 
below .80. These results should be checked with a larger and
more representative group as the sampling and number of 
cases were not adequate in the present study. Virginia Brown. 





The Use of Tests in the Illinois State Employment Service. 
The Personnel and Training Section in collaboration with 
the Local Office Operations Section. Division of Place- 
ment and Unemployment Compensation, 205 West Wacker 
Drive, Chicago, Illinois. February, 1941. 11 pp. 

This pamphlet is intended to assist interviewers in the use 
of test results as a supplementary tool in “making more 
objective the evaluations which must be made during the interview.”
The use of tests is related to the interviewer’s other
tools (i.e., Job Descriptions, The Dictionary of Occupational
Titles, Registration and Placements Aids). The article de- 
scribes the types of tests, proficiency and aptitude tests, used 
in aiding the interviewer to evaluate work skills. Aptitude 
test batteries have been developed for three fields: selling, 
clerical work, and manual work. Three graphic illustrations 
of relation of scores on aptitude tests to job performance are 
given. D. A. Peterson. 





Viteles, M. S. “A Psychologist Looks at Job Evaluation.”
Personnel Journal, XVII (1941), 165-176.

The author recognizes the importance of job evaluation as 
a basic feature of the industrial relations program. In pro- 
moting the adjustment of workers, there is a need for a pro- 
cedure designed to establish an equitable basis of compensa- 
tion, to facilitate transfer and promotion, and to eliminate 
duplication of activities. The chief consideration of the paper 
is a critical examination of the various types of job evaluation 
programs in the light of psychological principles and experi- 
ence. Ways are indicated in which improvements might be 
effected through the application of the techniques and prin- 
ciples of applied psychology. The present trial and error 
approach could be converted into a “rational, logical, and 
scientific system of analysis.” W. A. Varvel.


Welch, Alfred C. “An Analytic System of Testing Competitive
Advertising.” Journal of Applied Psychology, XXV
(1941), 176-189. 

This study is intended to correct the usual copy-test pro- 
cedure, which is defective in that it yields only a gross evalua- 
tion, by combining into a unified system a number of different 
tests which will provide the advertiser with clues to help him 
improve his advertising. An analytic system of testing com- 
petitive advertising was developed to provide a method of 
suggesting specific strong and weak aspects of an advertising 
campaign as well as to provide a gross evaluation of the effects 
of the campaign. The system is based upon four tests: a
Brand Preference scale (described previously); a Brand 
Familiarity scale (controlled association or aided recall) in 
which the respondents were required to name five brands in 
response to each of two stimulus-words, cigarettes and fountain 
pens; a Theme Familiarity test (Link’s method of triple asso- 
ciates) in which the respondent must identify the sponsor of a 
particular advertising theme; and a Theme Credence test (a 
belief test that does not require the respondent to report 
directly whether he believes an advertising claim). Tests of 
reliability and validity for the various scales indicate that the 
Brand Familiarity, Theme Familiarity, and Theme Credence 
Tests were useful supplements to the Brand Preference scale in 
analyzing the effects of advertising but that none of the three 
tests could be depended upon as a valid measure if used alone. 
Examples of the use of the analytic system are given. J. E. 
Karlin.


Wells, F. L. “Some Functions of Mental Measurements in
the Young Superior Adult.” Journal of Consulting
Psychology, V (1941), 105-110.

A review of cases seen through the psychiatric division 
of the student health department in a large endowed univer- 
sity reveals about ten classes of adjustment problems. These 
are distinguished by different patterns of performance on the 
various standard examination techniques. Representative 
cases and usual trends of each class are described. F. A. 
Kingsbury. 